Looking at the Data
I am fascinated by the rise of “Big Data”. It has become one of those “hot topics” and looks set to continue the trend in 2012. The ability to combine large amounts of data, both structured and unstructured, from private and public sources has the potential to have a profound impact on a range of businesses (and in some cases already has).
So I am learning about Hadoop and Cassandra and Amazon Compute Clusters and k-means clustering and…..
All of this crunching gives me more data. Some of it comes as lovely scatter plots, or charts that I can put trendlines on, but in the end I am just ‘looking at the data’.
Now I have spent many years “looking at the data” in many different contexts, and it has always amazed me what a ‘black art’ it is. It seems that good data insight comes from an experienced person who understands both the data and the business staring at these charts like some medieval soothsayer staring at lizard gizzards before incanting their insights. (For another metaphor, read ‘Data Alchemists’.)
I think one of the reasons that it is such a black art is that the most interesting ‘Insights’ are those that are not part of the collective psyche at that particular point in time. And knowing what is not known seems to be particularly challenging.
Information addicts
I am forever being frustrated by my own ability to waste huge amounts of time ‘reading’ the internet, believing that this is productive. And so, it seems, are many others, for example, here and here.
From time to time I get “motivated” and re-organise my reading habits. Sometimes I cull my regular feed lists, sometimes I limit reading to certain hours of the day, sometimes I attempt to disconnect completely for a day or more, sometimes I make a conscious effort to substitute ‘doing’ for ‘reading’. All of these are worthwhile, and work for a while…. but inevitably, like some sort of drug, it creeps its way back into more and more of my online time.
I think this experience is pretty common, and I know many have written strategies to help address this (I have read them after all!!). But what I haven’t seen discussed much is why it is such a powerful force. What is the neurological/psychological basis for this ‘addiction’? And it does seem to feel like it is some form of addiction. I am reminded of some research into ‘random rewards’ creating addictive behaviours. Does that help explain what is going on? If so, what is it about ‘reading online’ that triggers the shot of serotonin in your brain? Food, or physical drugs I can understand…. but why reading?
The Network vs the Computer
I read an article today (that I lost the reference to) that talked a bit about ‘the network’ – and it got me thinking about my earlier post about ‘Humans and the network’. I almost called it ‘Humans and computers’, but somehow ‘the network’ seemed more appropriate. Reading this other post today made me think that this distinction is in fact a big one.
It is really only in the last couple of years that we have started to interact with ‘the network’ independent of ‘the computer’. For this to happen two things need to be true. Firstly, you must be able to interact with this thing via multiple devices in a seamless fashion. This was starting to become true with laptops and desktops c.2005 – but really started properly with the release of the iPhone and subsequent devices that could browse the internet significantly better than anything before them. (Remember, that was only 4 years ago!!) The second thing needed for us to interact with ‘the network’ is true two-way interaction. The ability to create as well as consume content – on the fly – and for that content to appear on sites like Facebook or Twitter and then directly impact your network – which can then respond in kind – that is real interaction.
So today – in 2011 – it seems to me that our interactions are no longer with ‘the computer’ but are directly with ‘the network’. To some this just means being able to post and view Facebook on their phone and their laptop. But we have only scratched the surface. In the next few years it will mean much more than this. For example, in the days of the relatively static internet, the internet was just one thing – your internet and my internet were more or less the same. But when we have a fully dynamic and direct interaction with the network – then your network and my network begin to look very different (just like our real world networks are very different). Our social networks, our search results, the ads being displayed to us, the products being displayed to us by Amazon – even colour schemes and news items – all start to become unique. (This gives rise to the much talked about ‘Filter Bubble’.)
To put it another way, we have come to understand that computers are good at many things… they give the same result to calculations all the time, they never get bored, they never forget, they essentially do what we tell them to do – nothing more and nothing less. But these things are not true of ‘the network’. Have you ever posted something to Facebook only to see it disappear a short while later? Have you ever had some email from a trusted source suddenly get routed to your Spam folder? The ‘network’ does forget… and it does make mistakes. ‘The network’ and ‘the computer’ are very different things indeed – and over the next couple of years – we will interact more directly and transparently with the network itself – and we will be amazed.
Humans and the Network
I haven’t posted for a while…. natural wax and wane of projects I guess. My previous project (http://www.skorebug.com) has tapered off somewhat, and so have my posts here. But now I think it is time to broaden the scope of what I write about and see where that leads.
The thing that has been on my mind somewhat lately is the fit (or otherwise) between humans and the network. This is clearly a big (and possibly unproductive 😉) topic – but I thought it worth trying to focus some of my thoughts “on paper” (“in a database somewhere in the cloud” doesn’t have quite the same ring).
The ‘net brings a huge deluge of information to us every second. It seems that I spend stupid amounts of time ‘filtering’. Between news sites, old fashioned feeds, twitter, email and so on – I spend way too much time each day trying to ‘keep up’ and an inordinate fraction of time seems to be spent weeding through chaff to get to the gems.
There are some great services that attempt to make this ‘better’ (Summify is one of my favourites) – but invariably they end up being ‘yet another data source’ in my daily routine and I am left strangely unsatisfied.
In contrast, if I am ever away from the computer for more than a few days, the backlog of reading becomes mountainous. I let the mountain persist for a few days or even weeks – until eventually I realise that there is no way I am ever going to get through this mountain… and I just press ‘delete’ – the mountain disappears in an instant and bizarrely it is never missed – and I have a fleeting insight into the futility of the effort.
But inevitably I am drawn back into the cycle… as are thousands of others, reading blogs and twitter feeds and facebook updates – persisting despite sensing that it is somewhat unsatisfying. There is no end of ‘self help’ style advice: turn off your email, schedule fixed times, cull regularly. These are all things that I can do. But what I have been wondering lately is ‘what can the network itself do to help?’
A heisenbug
Just to vent my frustrations a little bit I thought I would post a heisenbug that hit me today. Today, I realised that some pages on my production site were throwing javascript errors (jQuery Mobile / Rails 3.1). The same site on my development machine was working fine. Of course being a production site, debugging the javascript to understand what was different proved difficult. Looking at the docs on the asset pipeline (http://guides.rubyonrails.org/asset_pipeline.html) I noted a nifty little feature which is the ability to add ?debug_assets=true to the end of a url to enable debugging of the asset pipeline.
Lo and behold, as soon as I turn this on, the javascript errors disappear and the site works properly again. I am sure this is indicative of some deeper underlying problem with what I have set up – but I haven’t been able to track it down yet.
Omniauthable – my version
So having had a bit of a rant about social authentication – and how few good examples I was able to find of how to use Devise’s Omniauthable module with multiple providers – I wanted to post here how I handled it. There are many pieces of this code copied directly from other sources. These sources are all referenced in my earlier post here.
Firstly, the heart of it, as per the Devise wiki, is users/omniauth_callbacks_controller.rb. The other key files in my case are the user model (user.rb) and the devise config file (config/initializers/devise.rb). The code is below. Some points worth noting:
- I had to modify devise to allow login either via username or email in order to support twitter (and not force users to have to provide an email after authenticating). This is mostly covered in the Devise wiki here.
- In order to allow those authenticating via oauth to edit their user settings (when they have no password) I had to follow this.
- Since accounts can be edited without a password, I removed the ability to edit the password via the user registration form. The only way to alter a password is then via the ‘forgotten password’ mechanism, which works by email. (So for example, if a user first authenticates via twitter and later wants to set a password to log in directly, they will first have to provide an email before updating their password. This is a touch clumsy but still more intuitive than forcing twitter users to enter their email just to complete the initial authentication.)
- Adding new providers is now pretty straightforward. The provider specific code is mostly confined to the provider_user_hash and provider_auth_hash blocks at the bottom of omniauth_callbacks_controller.rb
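As a rough illustration of that last point, the two provider-specific blocks might look something like the sketch below. This is not the original controller code: the method names are borrowed from the post, but the bodies are assumptions, and the key layout ("info", "credentials") is modelled on the OmniAuth 1.x auth hash rather than taken from the original (older OmniAuth versions used different keys).

```ruby
# Hypothetical sketch: names follow the post, bodies are assumptions.

# Pull the user-facing fields out of an OmniAuth-style auth hash.
def provider_user_hash(auth)
  info = auth.fetch("info", {})
  case auth["provider"]
  when "facebook"
    { name: info["name"], email: info["email"], username: info["nickname"] }
  when "twitter"
    # Twitter's oauth response carries no email, so it stays nil here.
    { name: info["name"], email: nil, username: info["nickname"] }
  else
    raise ArgumentError, "unsupported provider: #{auth['provider'].inspect}"
  end
end

# Pull the fields needed to call the provider's API later on.
def provider_auth_hash(auth)
  creds = auth.fetch("credentials", {})
  {
    provider: auth["provider"],
    uid: auth["uid"].to_s,
    token: creds["token"],
    secret: creds["secret"] # nil for providers that issue no secret
  }
end
```

With this shape, supporting another network would mostly mean adding one more `when` branch, which is the property the bullet above describes.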
Probably still much could be improved here – but it is the closest I have gotten so far to a ‘seamless social authentication’ using Devise and Rails.
Social authentication on Rails 3.1 – a rant!
Over the past couple of months I have been developing a (rails 3.1) web application (shameless plug www.skorebug.com). Since it is 2011, I figured it needed to be mobile (enter jquery mobile) and social. For the last week or more I have really been battling with ‘social’. In particular authentication using Facebook and/or Twitter credentials (oauth). This post is about some of the issues that I uncovered that I couldn’t find covered anywhere else. I hope to actually post the code I ended up with in a later post.
I listed in a recent post all of the links I found and tried following in developing this stuff. But across all of them there were many shortcomings. What I wanted was:
- Support for Devise’s :omniauthable. It seemed from the Devise wiki entry on the topic that if I could get this working – it would be much less code to support on my part. Although in all honesty I am still not 100% clear on the benefit of using omniauthable over Ryan’s approach.
- Graceful handling of the absence of email in Twitter’s oauth response. I wanted as little friction as possible. This meant coping with users who did not provide email if they authenticated via Twitter (but the site should also allow the user to enter their email at a later date to access email specific features of course).
- Save oauth tokens for accessing respective Twitter and Facebook APIs (for example posting to the user’s profile)
- Allowing users to edit their details even if they authenticated via Twitter/Facebook and therefore did not have a password on my site
- Update ‘user’ details from oauth in a sane manner (e.g. Use the ‘name’ and ‘email’ from Facebook’s oauth response for the user record if these are blank)
- Keep the social-network-specific code as small as possible so that support for future networks is sane.
- At a minimum – support for both Twitter and Facebook
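The “update details only if they are blank” requirement in the list above could be sketched roughly as follows. This is an illustrative helper on plain hashes, not code from the application: `fill_blank_attributes` is a hypothetical name, and a real version would work on the user model instead.

```ruby
# Hypothetical helper: copy values from an oauth response into a user's
# attributes, but only where the user has nothing stored yet.
def fill_blank_attributes(current, from_oauth)
  blank = ->(value) { value.nil? || value.to_s.strip.empty? }
  current.merge(from_oauth) do |_key, existing, incoming|
    # Keep what the user already has unless it is blank and the
    # provider actually supplied something.
    blank.call(existing) && !blank.call(incoming) ? incoming : existing
  end
end

user  = { name: "", email: "ann@example.com" }
oauth = { name: "Ann Example", email: "other@example.com" }
updated = fill_blank_attributes(user, oauth)
```

Here `updated` takes the name from the oauth response (the stored name was blank) and keeps the email the user already had.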
I was genuinely surprised that there wasn’t more in the way of examples available on the intertubes for these. Having worked with Rails for some months now – one of its great strengths is the huge amount of resources available. On almost any topic you can normally find great tutorials and articles – but for whatever reason – this area seemed a little short of great reference articles that covered what I considered to be the ‘basics’ of seamless social authentication.
The fact that Twitter does not provide email in their oauth response is probably the most significant cause of heartache. It seems that most of the articles I uncovered decide to make this ‘the user’s problem’ by popping up an extra screen to ask for an email address. But this is not a user problem – this is an application design problem to be solved by application designers and developers.
Devise, Omniauth and Omniauthable
So anyone who has wanted to integrate Devise and Omniauth recently has probably come across the recent Railscasts on the topic, here and here. However this was for Devise prior to 1.2, at which point the Devise guys introduced the :omniauthable module.
The problem, as highlighted by this stackoverflow question, is that it is quite hard to come across a good example of Devise’s omniauthable set up with multiple providers. After much searching, the closest I could find was here. (There is another example here. It didn’t specifically include multiple providers but the structure seemed to allow for it more so than the example on the Devise wiki.)
So my next gripe with what I had found to this point was that it didn’t handle the case of twitter authentication very well. Since twitter doesn’t return an email, the ‘Railscasts’ approach was to ask for an email upon signing up. This seemed too high friction for me and I wanted something that allowed users to register via twitter without having to provide any extra information.
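One way to get that low-friction sign-up is to identify accounts by provider and uid instead of by email, so that a missing email never blocks registration. The sketch below shows the idea with an in-memory Hash standing in for the users table; the method name and the field names are hypothetical, and this is not the approach from any of the linked articles.

```ruby
# Hypothetical sketch: `users` is a plain Hash standing in for the
# database, keyed by [provider, uid]. Email is optional.
def find_or_create_from_oauth(users, provider, uid, username, email = nil)
  key = [provider, uid.to_s]
  users[key] ||= { provider: provider, uid: uid.to_s, username: username, email: email }
end

users  = {}
first  = find_or_create_from_oauth(users, "twitter", 42, "ann")
second = find_or_create_from_oauth(users, "twitter", "42", "ann")
```

Both calls return the same record, and its email is simply nil until the user chooses to add one later.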
So I was just in the process of trying to meld all of this together into my own solution for this post, when ‘one more Google search’ (after many) yielded a wonderful gist here. This finally seems to answer most of my issues in a sane manner. I am about to give this a try now…. fingers crossed……
Web App Deployment
Have battled to deploy my first Rails 3.1 / jQuery Mobile site over the last couple of days. It is finally live at www.skorebug.com (the jQuery Mobile bit begins when you try and log in!)
A couple of things that bogged me down for ages:
- How to deploy? The Rails world seems to love Capistrano – but for a single developer/single server website it seemed overkill. Was tempted to just manually ftp files to my server but that made me feel yucky. Finally found a nice compromise using git. Probably millions like it – but this blog post got me started: http://pixelhum.com/blog/using-git-for-deployment
- Setting up the server? Part 1: How to actually install Ruby, Rails, Rubygems etc. I had used rvm in dev and liked the idea of using it on the server too. Battled with a bunch of things till I uncovered this: https://github.com/joshfng/railsready It was an absolute life saver and took care of all the ‘boring bits’ nicely 😉
- Setting up the server? Part 2: Where to put the files (used it in dev – seemed like a good idea) and a bunch of other questions. The Rails community seems to have great materials on how to get started with the dev bit – but I really struggled to find good resources about deployment (that weren’t 60 pages long). The secret that didn’t seem well documented is to point your web server at the /public directory of your Rails app (and essentially copy everything from your dev environment to production). This seemed counter-intuitive at first – since on Rails 3.1 I had never touched the /public directory (other than to delete index.html 😉). I guess that is the price you pay for starting to learn Rails 3.1 whilst it is still unreleased.
- X-Sendfile!!! At this point the basic app was working nicely except for images. It puzzled me for ages – but I eventually discovered that X-Sendfile had to be configured for my web server (Apache in this case). Again something that was not obvious from the documentation. Even the apparent option in config/environments/production.rb to disable X-Sendfile didn’t seem to work as expected. So yes – you need to make sure that X-Sendfile is installed on your web server and enabled for each of your Apache VirtualHosts. I did find a reported ‘Issue’ under the Rails project on github – it seems that something has changed in 3.1 in this regard – but as a n00b – it made no sense whatsoever 😉
So finally – a process that should have taken a couple of hours was finished after a couple of days – but at least it is finished!!
(Note the site itself is functional – but far from “finished” – so if you are one of the three people in the world that read this post and go to the site (Hi Mum!!) – please be gentle 😉)