Archive for the ‘Research’ Category

How attractive is your website?

Monday, August 18th, 2008

I was trying to analyze the feedback on my website’s new design. There seems to be a trend that relates how people use the website to the feedback they give.

While researching this subject, I found a paper by three people affiliated with the University of Manchester, UK. The paper makes three interesting hypotheses, which it eventually proves:

  1. User preference will be determined by interactions between decision criteria and subject background, specifically design-training and aesthetics, culture and identity.
  2. User intentions will be determined by interactions between decision criteria and the task context; specifically, serious use will favor usability and content, less serious use will favor aesthetics.
  3. User judgment will be determined by interactions among decision criteria; specifically, positive aesthetics will over-rule poor usability.

Each student was randomly asked to consider three departments for either a one-month summer internship or a five-year PhD, and then to judge the department websites with that task in mind. The three departments were all at the same university, Stanford: the Design department website, the HCI website and the D-School website.

What was interesting to note was that most of them rated the D-school best when asked to consider the one-month summer internship. But when the task was shifted to the five-year PhD, they all rated the HCI website better! All other constraints remained unchanged - the same university, the same websites, the same variation in backgrounds of people, etc.

From my understanding of the results, people prefer less-aesthetic websites for serious/regular usage. Perhaps this explains why advanced users prefer Gmail to Yahoo! Mail - one focuses on simplicity and elegance while the other focuses on usability and attractiveness.

On the other hand, the study “suggests that users’ overall impression of a website could be a determinant of user satisfaction and system acceptability, even overcoming poor usability experience and poor content”.

Perhaps this explains why we put up with a not-so-great UI on the IRCTC.co.in website: it has great value because it solves the “critical” problem of buying train tickets. Yet, we wouldn’t have tolerated this kind of UI for other purposes. For example, such a UI could never have worked for a survey website or a form-builder. That’s exactly why Wufoo.com has to have such a great UI.

This reminds me of an amazing talk by Geoffrey Moore at an internal Adobe conference. He explained the different types of innovation: product leadership, customer intimacy and operational excellence, each of which in turn has four types. The trick for a good company is to have aligned vectors of innovation where they have to excel, and non-aligned vectors of innovation where they have to be “good enough”.

So, in terms of websites, ideally, a website should either excel at content and service and be good enough at aesthetics, or excel at aesthetics and be good enough at content and service. It does NOT need to excel at both (but of course, it’s good if you can).

Super Crunchers

Monday, June 23rd, 2008

Today, I re-read a book called Super Crunchers: How Anything Can Be Predicted by Ian Ayres.

So what is supercrunching?

Now something is changing. Business and government professionals are relying more and more on databases to guide their decisions. The story of hedge funds is really the story of a new breed of number crunchers - call them Super Crunchers - who have analyzed large datasets to discover empirical correlations between seemingly unrelated things. Want to hedge a large purchase of euros? Turns out you should sell a carefully balanced portfolio of twenty-six other stocks and commodities that might include Wal-Mart stock.

What is Super Crunching? It is statistical analysis that impacts real-world decisions. Super Crunching predictions usually bring together some combination of size, speed and scale. The sizes of datasets are really big - both in the number of observations and in the number of variables. The speed of the analysis is increasing. We often witness the real-time crunching of numbers as the data come hot off the press. And the scale of the impact is sometimes truly huge. This isn’t a bunch of egghead academics cranking out provocative journal articles. Super Crunching is done by or for decision makers who are looking for a better way to do things.

This is best explained by the chess example:

We tend to think that the chess grandmaster Garry Kasparov lost to the Deep Blue computer because of IBM’s smarter software. That software is really a gigantic database that ranks the power of different positions. The speed of the computer is important, but in large part it was the computer’s ability to access a database of 700,000 grandmaster chess games that was decisive. Kasparov’s intuitions lost out to data-based decision making.

(emphasis mine)

The book starts off with the example of Orley Ashenfelter, a Princeton economics professor as well as founder and editor of the Journal of Wine Economics, who wanted to apply supercrunching techniques to predict whether a wine from a particular year would be a good wine or not. He ended up with the following equation:

Wine quality = 12.145 + 0.00117 winter rainfall + 0.0614 average growing season temperature - 0.00386 harvest rainfall

You can imagine the commotion that followed. The wine experts brushed off this theory and the idea that numbers could predict wine quality better than they could. After all, “Just as it’s more accurate to see the movie, shouldn’t it be more accurate to actually taste the wine?”

And yet, the equation did indeed make better predictions, especially with the prediction that 1989 and 1990 wines would be bestsellers.
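The equation quoted above is simple enough to write as a small function. This is only a sketch: the function and parameter names are my own, the post does not state the units of each input, and the sample numbers below are made up purely to show the call.

```python
def predicted_wine_quality(winter_rainfall, growing_season_temp, harvest_rainfall):
    """Evaluate the wine quality equation quoted above.

    The coefficients come straight from the quoted equation. The
    parameter names are illustrative, and the inputs must be in
    whatever units the original equation assumes.
    """
    return (
        12.145
        + 0.00117 * winter_rainfall
        + 0.0614 * growing_season_temp
        - 0.00386 * harvest_rainfall
    )


# Made-up inputs, only to show how the terms combine.
quality = predicted_wine_quality(
    winter_rainfall=600, growing_season_temp=17, harvest_rainfall=100
)
print(f"{quality:.4f}")
```

Reading the signs of the coefficients: more winter rain and a warmer growing season push the predicted quality up, while rain at harvest time pulls it down.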


Cut down that movie

Thursday, May 29th, 2008

How would life be if you could tell your computer to cut down a 3-hour movie to one hour?

Sounds impossible?

From what I understand of this paper called “Feature fusion and redundancy pruning for rush video summarization” by the people at the Vision Research Laboratory at UCSB, it is very much possible!

The basic idea is to find ‘distinctive’ parts of the video, for example, someone talking at a high pitch or lots of moving scenes which, intuitively, would be more important than a slow scene or repeated shots.

They consider multiple facets of the video such as speech, camera motion, significant differences in color, suppression of repeated scenes and of course, identification of visually distinct segments.

The caveat is that their test data set consists of drama “rushes” videos, which are raw footage including the clapboards, the color tones, repeated takes, etc. This is very conducive to such an algorithm, which could probably explain why they had such good results (details are in the paper).
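To make the “keep the distinctive parts, drop the repeats” idea concrete, here is a toy sketch. It is not the algorithm from the paper: the `Segment` fields, the single `score` number, the feature vectors and the distance threshold are all assumptions of mine. It simply picks the highest-scoring segments greedily, skips any segment that looks too similar to one already chosen, and stops adding once a time budget is used up.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Sequence


@dataclass
class Segment:
    start: float               # start time in seconds
    duration: float            # length in seconds
    score: float               # hypothetical "distinctiveness" score
    features: Sequence[float]  # hypothetical feature vector (color, motion, ...)


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def summarize(segments: List[Segment],
              budget_fraction: float = 0.04,
              min_distance: float = 0.5) -> List[Segment]:
    """Greedy toy summarizer: best-scoring segments first, no near-duplicates."""
    budget = sum(s.duration for s in segments) * budget_fraction
    chosen: List[Segment] = []
    used = 0.0
    for seg in sorted(segments, key=lambda s: s.score, reverse=True):
        if used + seg.duration > budget:
            continue  # would exceed the time budget
        if any(distance(seg.features, c.features) < min_distance for c in chosen):
            continue  # too similar to a segment we already kept (e.g. a repeated take)
        chosen.append(seg)
        used += seg.duration
    # Return the kept segments in their original playback order.
    return sorted(chosen, key=lambda s: s.start)
```

A real system would compute the scores and features from the audio and video itself; here they are just numbers handed in by the caller.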

But if this is the state of things today, I can imagine that around five years down the line they would really be applying it to commercial movies and television shows. It is amazing what can be done with a combination of mathematics, statistics and computers.

Interestingly, the final summaries were around 4% of the total video length. If this were applied to the 8-year-long Kyunki Saas Bhi Kabhi Bahu Thi show, I wonder how much it would be reduced to…


Update: Now Microsoft Research has done it for audio as well!

Why does crowdsourcing work?

Monday, February 25th, 2008

Tim O’Reilly’s definition of Web 2.0 makes it clear that “crowdsourcing” is one of the defining features of Web 2.0, not only RIAs:

“The service automatically gets better the more people use it.”

Crowdsourcing is about taking it to the next step where people ‘contribute’ something to the ‘system’.

There are many people and companies trying to make crowdsourcing work in different areas. For example, at Kluster, the participants get to design a product, etc. and the participants who back the winning idea get to share the reward. What is interesting is the story behind Kluster:

Kaufman came up with the idea for Kluster at his last startup, Mophie, which makes iPod accessories and was recently sold to mStation for an undisclosed sum. One of Mophie’s hit products is the Bevy, an all-in-one iPod Shuffle case, bottle opener, cord-wrap, and keychain. The company designed it at last year’s MacWorld conference in 72 hours with input from 30,000 customers using software that was a precursor to Kluster. According to Kaufman, Mophie sold hundreds of thousands of the $15 cases.

And from the June 2006 Wired magazine article:

Melcarek (a registered user at InnoCentive.com) solved a problem that stumped the in-house researchers at Colgate-Palmolive. The giant packaged goods company needed a way to inject fluoride powder into a toothpaste tube without it dispersing into the surrounding air. Melcarek knew he had a solution by the time he’d finished reading the challenge: Impart an electric charge to the powder while grounding the tube. The positively charged fluoride particles would be attracted to the tube without any significant dispersion.

“It was really a very simple solution,” says Melcarek. Why hadn’t Colgate thought of it? “They’re probably test tube guys without any training in physics.” Melcarek earned $25,000 for his efforts. Paying Colgate-Palmolive’s R&D staff to produce the same solution could have cost several times that amount – if they even solved it at all.

More examples are:

  • Dell Idea Storm where customers vote for what products they want Dell to make next - this is how Dell’s recent introduction of Linux laptops happened.
  • Get Satisfaction which is “people-powered customer service”
  • Intel asking the crowd what the next Google will be
  • MicroPledge and Cofundos where people pledge their money for software ideas they like. Once a good amount is reached, someone takes up that pledge and works on it. If he/she completes it successfully, they get the money and the crowd gets the software they want. This is the crowdsourced version of a bounty.
  • Sell-a-Band where people pledge their money on bands they like. Sufficient money implies the band gets to record an album with that money. If the album sells, the crowd, the band and the SellaBand website share the profit.
  • Kiva for microfinance loans to entrepreneurs in developing countries.
  • Wesabe for personal finance.
  • CrowdSpirit for electronics.
  • Threadless for T-shirts.
  • Everywhere Mag for a travel magazine.
  • Crowdsourcing.com is crowdsourcing a book on crowdsourcing. Say that fast thrice.
  • We can also include YouTube under the entertainment category.
  • And many many more.

Heck, we even have an O’Reilly book on ‘Programming Collective Intelligence’ (which has been sitting on my to-read list for too long).

The biggest and best example, of course, is Wikipedia, one of the top 10 largest websites in the world.

The article that blew my mind (and got me wondering about crowdsourcing in the first place) is the Wikipedia page on British crown succession (via IndiaUncut) - this page lists 1388+ people who are in the succession line for the crown!

But I wonder, why did Wikipedia work? Or rather, what makes people contribute to Wikipedia?

The best research on this topic that I found was the article What Motivates Wikipedians? in the CACM monthly magazine:

[Image: What motivates Wikipedians?]

I wonder if the companies mentioned above are specifically tapping into some of these motivations.

The article goes on to explain the relative importance of these motivations in their survey. I was seriously surprised at how high Ideology and Values rank here! If you get a chance, do read the whole article; it’s a good piece of research.

Another interesting piece of research was the paper Becoming Wikipedian: transformation of participation in a collaborative online encyclopedia, which traces how a casual visitor starts reading Wikipedia and goes on to become a member of the community, and how the social structure and technological aspects enable this.

I think I’m now beginning to understand what Jimmy Wales (founder of Wikipedia) said when he was asked the same question:

Love. It isn’t very popular in technical circles to say a lot of mushy stuff about love, but frankly it’s a very very important part of what holds our project together.

I have always viewed the mission of Wikipedia to be much bigger than just creating a killer website. We’re doing that of course, and having a lot of fun doing it, but a big part of what motivates us is our larger mission to affect the world in a positive way.

Imagine a world in which every single person on the planet is given free access to the sum of all human knowledge. That’s what we’re doing.

Although this reasoning may apply to Wikipedia which is an encyclopedia and information-centric, I wonder whether the same applies to the other examples above. For example, consider Threadless.com for T-shirt designs… what are the motivations for people in that community? And how much does the website’s social and technological structure play a role? What are the magic ingredients that make a crowdsourcing website become successful?

Maybe I should crowdsource this question. Hmmm.

Maybe it is no different from any other kind of website that becomes successful, but I think crowdsourcing websites are distinct from content websites like SmashingMagazine.com or e-commerce websites like Amazon and eBay.

Now, the next question is: has anybody successfully crowdsourced anything in an India-specific way?


Update on 2008 May 13: ReadWriteWeb has a similar list.

Innovation in Indian universities?

Wednesday, January 2nd, 2008

A while ago, I was asking myself Where are the killer applications on the web for India?

Today, when I read ReadWriteWeb’s article on The State of Innovation in India, a thought struck me about the relationship between innovation and universities. Everyone knows the story of how companies like Yahoo!, Google and Sun Microsystems all started at Stanford University, how FreeBSD came out of the University of California, Berkeley, and so on. I hope you also know how the great Nalanda University in the 5th century was a hotbed of advancements (more on that in another story).

Is it that a strong ideas culture is instilled only in a good university environment and the ecosystem around it which includes startups and businesses? Perhaps this explains why there is such amazing stuff being incubated at the TeNeT, IITM.

It reminded me of an article by Prabhakar Raghavan, Head of Yahoo! Research, where he says:

India’s real infrastructure problem–with no solution in sight–is not airports or electricity; it is the virtual nonexistence of graduate education and research in information and other crucial technologies. Consider this for starters: The U.S. produces about 1,400 Ph.D.s in computer science annually and China about 3,000. By stark comparison, India’s annual computer science Ph.D. production languishes at roughly 40. That number is about the same as that for Israel, a nation with roughly 5% of India’s population size.

Now you may ask, why is this important? That is best explained by C.N.R. Rao, Science Advisor to India’s Prime Minister, speaking about why money is spent on moon rockets when there is poverty to address:

You cannot be industrially and economically advanced unless you are technologically advanced, and you cannot be technologically advanced unless you are scientifically advanced.

Amen.

SQL and XML are not that different

Monday, May 9th, 2005

About a year ago, I gave my 8th semester presentation on Xen, now called C-Omega. It is a language that combines SQL, XML and OOP into one tight language. The paper that proposed this language was named Programming with Circles, Triangles and Rectangles. The circle represents the encapsulation behavior of objects and OOP, the triangle represents the tree structure of XML and the rectangle represents the tabular structure of databases.

Video of Anders Hejlsberg talking about C# 3.0

I recently came across Anders Hejlsberg’s interview on Channel 9 regarding programming data in C# 3.0 and it looks like C-Omega is going to be ‘merged’ into 3.0. It’s amazing that MS has taken this concept (which seemed totally radical to me when I first read about it) to production quality and is actually going to make this a core part of their platform.

Let us consider an example of using C-Omega. Suppose you want to handle books in a program used to manage libraries. Then you could write a book class using C-Omega as

[code]
public class book {
  sequence {
    string title;
    choice {
      sequence { editor editor; }+;
      sequence { author author; }+;
    }
    string publisher;
    int price;
  }
  attribute int year;
}
[/code]

The cool part is that the same class can be used to store the data either as XML or in a relational database. You can also instantiate an object using XML syntax:

[code] book b = SwaroopC H www.byteofpython.info 250 ; [/code]

Note that this syntax is still statically typed. Needless to say, the C-Omega compiler must be one heck of a monster.

The Python connection is that the C-Omega-ish method of access will probably be included into IronPython at some stage. Even if that doesn’t happen, we already have Pythonic ways of doing XML as pointed out long ago by wspace.
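As a rough illustration of the “one class, two representations” idea in plain Python, here is a minimal sketch using only the standard library. It is not C-Omega and not how IronPython would do it: the `Book` class, the table names and the sample values are all invented for this example, and the editor/author choice from the class above is simplified to a list of authors.

```python
import sqlite3
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List


@dataclass
class Book:
    title: str
    authors: List[str]
    publisher: str
    price: int
    year: int

    def to_xml(self) -> ET.Element:
        """Triangle: the same data as an XML tree, with year as an attribute."""
        book = ET.Element("book", year=str(self.year))
        ET.SubElement(book, "title").text = self.title
        for name in self.authors:
            ET.SubElement(book, "author").text = name
        ET.SubElement(book, "publisher").text = self.publisher
        ET.SubElement(book, "price").text = str(self.price)
        return book

    def save(self, conn: sqlite3.Connection) -> int:
        """Rectangle: the same data as rows in two (hypothetical) tables."""
        conn.execute(
            "CREATE TABLE IF NOT EXISTS books ("
            "id INTEGER PRIMARY KEY, title TEXT, publisher TEXT, "
            "price INTEGER, year INTEGER)"
        )
        conn.execute(
            "CREATE TABLE IF NOT EXISTS book_authors (book_id INTEGER, name TEXT)"
        )
        cur = conn.execute(
            "INSERT INTO books (title, publisher, price, year) VALUES (?, ?, ?, ?)",
            (self.title, self.publisher, self.price, self.year),
        )
        conn.executemany(
            "INSERT INTO book_authors VALUES (?, ?)",
            [(cur.lastrowid, name) for name in self.authors],
        )
        return cur.lastrowid


# Made-up sample values, only to show both representations.
book = Book("Example Title", ["Example Author"], "Example Press", 250, 2005)
print(ET.tostring(book.to_xml(), encoding="unicode"))

conn = sqlite3.connect(":memory:")
book.save(conn)
print(conn.execute("SELECT title, price, year FROM books").fetchall())
```

The difference from C-Omega is that here the two mappings are written by hand as methods, whereas the language described in the paper builds them into the type system itself.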

If you have ever written a program that uses databases, I highly recommend reading the Circles, Triangles and Rectangles paper. It just might change the way you think about databases and SQL, or even XML for that matter.

You can also download that old presentation of mine on Xen.