Thursday, February 07, 2008

Money:Tech 2008 Panel


I sat on the "Building a Better Information Beast" panel at O'Reilly's Money:Tech conference this week in NY. The other participants were Randall Winn of Capital IQ, Kevin Pomplun of SkyGrid and Renny Monaghan of Salesforce.com, with Rob Passarella of Bear Stearns moderating. The panel was a good one and Rob was a very engaging moderator. I'm still not quite sure why Salesforce was on the panel, but what the hell, right? Capital IQ is an obvious choice for a next-generation Bloomberg: they have most of the data, can acquire what they don't have, can move faster than Bloomberg, are smart guys, etc. SkyGrid is in the midst of closing its A round and wants to be the unstructured information aggregator, probably most directly competitive with InfoNGen.

Near the end of our discussion Rob asked the panel how all of this would come together. Would companies need to assemble it themselves out of the pieces, or will someone emerge to bring it all together? The obvious answer is the latter. It's too big a market and too big an opportunity for someone not to put this all together. The big question is when. The data companies need to make their data more open and "mashable," and then they will need some way to link context between the data sets. The obvious answer for the financial market is to link by ticker+exchange, or by some more consistent code like a CUSIP, since tickers change from time to time.

This is a fairly complicated problem with unstructured data, however, since you need to determine whether what people are talking about should be assigned to a specific ticker. For instance, if someone is talking about Chinese-manufactured pet food, should you assign this to all pet food tickers? Should you assign it only to the pet food tickers that private-label the Chinese manufacturer's product? Should you not assign it to a ticker at all? As it turns out, contextual integration is fuzzy. There needs to be a variable association, from tight to loose, which in search we commonly refer to as relevance. You can also think of this as "relationship strength." We've attempted to do this with our ranking engine at Collective Intellect, but for this to work in a mashable way we will all need to agree on a common key in the information taxonomy and a normalized value for relationship strength. We're getting there, and work at companies like Metaweb is helping to think this through, but I'm afraid we're still years away.
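To make the "common key plus normalized relationship strength" idea a bit more concrete, here is a rough Python sketch. It is not how our ranking engine works; every name in it (`Association`, `normalize`, `mash`, the made-up security IDs and scores) is an assumption for illustration only. Each source scores its own documents however it likes, rescales that score to a shared 0-to-1 range, and tags it with an agreed-upon security key so that data sets from different providers can be merged.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    """One link between a piece of content and a security (hypothetical schema)."""
    security_id: str  # the agreed common key, e.g. a CUSIP-like code
    source: str       # which provider produced the link
    doc_id: str       # the provider's own document identifier
    strength: float   # relationship strength, normalized to [0, 1]


def normalize(raw_score, max_score):
    """Rescale a provider-specific score to [0, 1], clamping out-of-range values."""
    if max_score <= 0:
        return 0.0
    return max(0.0, min(1.0, raw_score / max_score))


def mash(associations, min_strength=0.0):
    """Group associations by the common key, strongest links first."""
    by_key = defaultdict(list)
    for assoc in associations:
        if assoc.strength >= min_strength:
            by_key[assoc.security_id].append(assoc)
    for items in by_key.values():
        items.sort(key=lambda a: a.strength, reverse=True)
    return dict(by_key)


# Made-up data for the pet food example: the private labeler gets a tight
# association, another pet food company gets a loose one.
story = [
    Association("ID-PRIVATE-LABEL-CO", "blogs", "post-1", normalize(9, 10)),
    Association("ID-OTHER-PETFOOD-CO", "blogs", "post-1", normalize(3, 10)),
    Association("ID-PRIVATE-LABEL-CO", "news", "article-7", normalize(40, 50)),
]
merged = mash(story, min_strength=0.5)
```

With a threshold of 0.5, only the tightly associated company survives the merge, and its links from both sources end up side by side. Where to set that threshold is exactly the tight-versus-loose judgment call described above.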
