Launch webinar!

So you have data. Tons of it. Some of it is easily quantifiable, organizable, and analyzable.

But a lot of that data is in the form of complex text: sentences that contain ideas and insights a computer couldn’t understand. Until now.

We’re introducing Luminoso (www.luminoso.com), a new text analytics engine that understands what people are saying the way we understand each other.

Developed by MIT Media Lab alumni, Luminoso’s software is built on a structural foundation of common-sense reasoning. Because it contains a massive body of cognitive understanding, Luminoso identifies the nuances and colorations of meaning in any written text.

It connects the dots and draws out patterns and associations not possible before, because it grasps the conceptual meaning of language.

Let us surprise you with a technology that knows the difference between “a webinar you won’t want to miss” and “a webinar that everyone should miss.” Luminoso’s launch webinar on Tuesday, April 16th, 2 PM EDT is definitely the former. Join us and see how we can put your text in context.

Registration URL:
https://attendee.gotowebinar.com/register/7195588171277305600

Webinar ID: 135-448-403

UPDATE: Our webinar provider appears to be down right now. We hope you’ll remember us and come back when the link is working. In the meantime, you can join our mailing list on our home page.

UPDATE 2: Webinar’s back up!

SXSW and the obligatory Twitter demo!

Many of us are at South by Southwest. We’re picking up a lot of leads who want to do interesting pre-launch pilot projects with us. (Perhaps too many!) We’re also seeing a lot of performances by excellent bands you probably haven’t heard of yet, which is a very important thing to do at SXSW.

It is of course an Austin city ordinance that every tech company presenting at SXSW must have a Twitter demo.
So we have one. It’s what we’ve been using to show people how our system understands topics, and how it can do so using real-time data.

Basically, suppose you were to try to follow the hashtag #sxsw. It’s a mess. Tweets arrive there faster than you can reasonably read them, and many of them are kind of crap of the form “retweet this for your chance to win a prize!”

So our demo, sxsw.luminoso.com, does two things: it completely removes tweets from the stream that appear to be uninteresting or spammy, and out of the remaining tweets it shows you the ones that appear to correspond to selectable topic areas such as music, film, food, and (perhaps most relevantly) parties.

When watching another company’s SXSW Twitter demo, I noticed that they emphasized things that were being retweeted a lot — and that every one of them was one that we would want to filter out. The people retweeting weren’t participating in an authentic conversation. They were just being promised something in exchange for retweeting. This has no value to anyone else reading the tweet.

This demo of course works using our fuzzy semantic matching, not exact keywords. I was happy to see it quickly learn that “Spotify House” is referring to a music event, for example, with most of the tweets about it not including anything like the word “music”.
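
To give a rough feel for what matching by meaning rather than by exact keyword can look like, here is a toy sketch. It is not Luminoso's actual system: the three-dimensional term vectors, the topic vectors, the threshold, and the function names are all invented for illustration, and a real system would learn its vectors from a large amount of text.

```python
import math

# Hypothetical term vectors, invented for this sketch. The three axes
# loosely stand for (music, film, food).
TERM_VECTORS = {
    "spotify": (0.9, 0.1, 0.0),
    "house": (0.3, 0.2, 0.2),
    "band": (1.0, 0.0, 0.0),
    "screening": (0.0, 1.0, 0.0),
    "tacos": (0.0, 0.0, 1.0),
}

# Hypothetical topic vectors on the same axes.
TOPIC_VECTORS = {
    "music": (1.0, 0.0, 0.0),
    "film": (0.0, 1.0, 0.0),
    "food": (0.0, 0.0, 1.0),
}


def cosine(a, b):
    """Cosine similarity of two vectors, or 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def tweet_vector(text):
    """Average the vectors of the known terms in a tweet."""
    words = [w.strip("#.,!?").lower() for w in text.split()]
    known = [TERM_VECTORS[w] for w in words if w in TERM_VECTORS]
    if not known:
        return (0.0, 0.0, 0.0)
    return tuple(sum(column) / len(known) for column in zip(*known))


def best_topic(text, threshold=0.5):
    """Return the closest topic, or None if nothing is close enough."""
    vec = tweet_vector(text)
    scores = {topic: cosine(vec, tv) for topic, tv in TOPIC_VECTORS.items()}
    topic = max(scores, key=scores.get)
    return topic if scores[topic] >= threshold else None


print(best_topic("Heading to Spotify House tonight"))
print(best_topic("retweet this for your chance to win a prize!"))
```

In this toy setup the first tweet lands on "music" without containing the word, and the second matches no topic at all, so it would simply not be shown.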

Now if only the demo could tell you which music to listen to. But actually, it seems to know to leave that up to NPR.

How to make an orderly transition to Python Requests 1.0 instead of running around in a panic

There’s a lovely Python module for making HTTP requests, called requests. We use it at Luminoso. A bunch of code we depend on uses it. Our API customers use it. Basically everyone uses it because it’s the right thing to use.

Yesterday we did our first code update of the new year on our development systems, and found that suddenly nothing was working. Meanwhile, our customers sent us bug reports for similar reasons. We’d see errors like this one:

TypeError: session() takes no arguments (1 given)

And all kinds of code would crash with this kind of error, which occurs because Requests changed .json from a property to a method:

TypeError: 'instancemethod' object has no attribute '__getitem__'
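
As a small illustration of that second error, here is a sketch of a helper that reads the JSON body whether .json is a property (0.x style) or a method (1.0 style). The helper name get_json and the two stand-in response classes are hypothetical; they only mimic the shape of the real response objects so the example runs on its own.

```python
def get_json(response):
    """Return the decoded JSON body from either style of response object."""
    payload = response.json
    # In the 1.0 style, .json is a bound method and has to be called.
    # In the 0.x style, it is already the decoded data.
    return payload() if callable(payload) else payload


class OldStyleResponse:
    """Stand-in for a 0.x response: .json is a property."""

    @property
    def json(self):
        return {"status": "ok"}


class NewStyleResponse:
    """Stand-in for a 1.0 response: .json is a method."""

    def json(self):
        return {"status": "ok"}


print(get_json(OldStyleResponse())["status"])
print(get_json(NewStyleResponse())["status"])
```

Indexing NewStyleResponse().json directly, without calling it, is what produces a TypeError like the one above.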

You see, on December 17, Kenneth Reitz released version 1.0 of requests and declared “This is not a backwards compatible change.” As far as we can tell, this has caused a small ripple of version-related panic in the Python world. We know it’s okay to break compatibility when changing the major version number. That’s what major version numbers are for. But the problem is that it’s really hard to deal with multiple incompatible versions of the same Python package.

If you type pip install requests now, you’ll get version 1.0, and it won’t work with most code written for version 0.14. So maybe you should ask for “requests < 1.0” or “requests == 0.14.2”, and maybe even declare that dependency in setup.py. That was certainly the stopgap measure we went around applying yesterday.

The problem is that, once you do that, you can’t ever upgrade to Requests 1.0 or install any code that uses Requests 1.0, unless you port all your code and update all your Python environments at once. Not even virtualenv will help. You just can’t have an environment that depends on “requests < 1.0” and “requests >= 1.0” at the same time and have your code keep working.
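
The conflict can be spelled out in a few lines. This is only a sketch: the function names are made up, the version parsing is deliberately naive (plain dotted integers), and it is not how pip resolves dependencies.

```python
def parse_version(text):
    """Turn a dotted version string such as "0.14.2" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def allowed_by_old_code(version):
    """The stopgap pin: requests < 1.0."""
    return parse_version(version) < (1, 0)


def allowed_by_new_code(version):
    """What ported code needs: requests >= 1.0."""
    return parse_version(version) >= (1, 0)


def versions_satisfying_both(versions):
    """Versions that could serve old and new code in one environment."""
    return [
        v for v in versions
        if allowed_by_old_code(v) and allowed_by_new_code(v)
    ]


print(versions_satisfying_both(["0.14.2", "1.0"]))
```

Whatever list of versions goes in, the result is empty, because the two requirements exclude each other by construction.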

The requests-transition package

We want to make it possible to move to the shiny new Requests 1.x code. But we also want our code stack to keep working in the present. That’s the purpose of requests-transition. All it does is install both versions of requests as two different packages with different names.

The slogan of requests is “Python HTTP for Humans”. The slogan of requests-transition is “Python HTTP for busy people who don’t have time to port all their code yet”.

To install it using pip:

pip install requests-transition

Now you can stabilize your existing code that uses requests 0.x by changing the line

import requests

to

import requests0 as requests

When you port the code to use requests 1.0, change the import line to:

import requests1 as requests

In the future, when all your dependencies use requests 1.0 and 0.x is a distant memory, you should get the latest version of the real requests package and change the import lines back to:

import requests

And that is how you transition to requests 1.x, calmly and painlessly.
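
If the same code has to run in environments with and without requests-transition during the changeover, one option is an import fallback. The helper below is a hypothetical sketch, not part of requests-transition; the only names taken from the package are the module names requests1 and requests0 mentioned above.

```python
import importlib


def import_first_available(*module_names):
    """Return the first module in module_names that can be imported."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError(
        "none of these modules could be imported: " + ", ".join(module_names)
    )


# Intended use during a transition (left commented out so this sketch runs
# even when neither package is installed):
# requests = import_first_available("requests1", "requests")

# Demonstration with standard-library names only.
module = import_first_available("no_such_module_for_this_sketch", "json")
print(module.__name__)
```

Falling back to plain requests is only safe if you know the installed version matches the API your code expects, so this is a convenience for the transition period rather than a substitute for porting.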

We have already updated our API client code to use requests-transition, instead of forcing you to install “requests < 1.0”.

Watch python-requests-transition on GitHub

Nurble nurble

A Saturday Morning Breakfast Cereal comic suggested that political speeches can be improved by replacing everything but the nouns with “nurble”. So we did so.

Being unable to resist putting a bit of semantics into everything, we also scored every noun that remains for how “political” it is, based on how much it matches a Luminoso vector made from all the Presidential inaugural addresses. The average is plotted on a “politic-o-meter”.

Nurble vice president, nurble nurble justice, nurble nurble citizens, nurble nurble nurble humility nurble honor nurble nurble nurble people nurble nurble nurble nurble. nurble nurble nurble nurble nurble nurble nurble nurble nurble nurble nurble nurble nurble nurble nurble welfare nurble nurble nation nurble nurble nurble peace nurble nurble world.
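
For the curious, the nurbling step itself can be sketched in a few lines. This is a simplified stand-in, not the code behind the demo: it uses a small hand-picked noun list (hypothetical) where a real version would need a part-of-speech tagger, and it leaves out the political scoring entirely.

```python
import re

# Hypothetical hand-picked nouns; a real version would tag parts of speech.
EXAMPLE_NOUNS = frozenset({"people", "nation", "peace", "world"})


def nurble(text, nouns=EXAMPLE_NOUNS):
    """Replace every word that is not in `nouns` with "nurble"."""

    def replace(match):
        word = match.group(0)
        return word if word.lower() in nouns else "nurble"

    # Only runs of letters and apostrophes are replaced, so punctuation
    # and spacing survive untouched.
    return re.sub(r"[A-Za-z']+", replace, text)


print(nurble("We the people seek peace in the world."))
```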

We hope you enjoy our weighty contribution to the political discourse. And remember to nurble next Tuesday!

Webinar on Web Data for Market Research

A few months back, we were introduced to a company called Connotate which makes web content monitoring software — that is, a great tool for getting information out of webpages and into a workflow. Since so much of the web is text and we’re all about analyzing and making sense of text, they’re basically bacon for our chocolate.

Our joint efforts with Connotate originated in a client project: we were working to deliver a breakthrough Voice of the Customer solution for a major consumer packaged goods manufacturer. Connotate proved to be the ideal solution for gathering unstructured product review comments and transforming them into a usable, structured format. We’ve used them in several other contexts since, formed a proper partnership, and are doing a webinar on the use of web data in market research next Thursday, 11/1, at noon Eastern time. It’s going to be pretty cool. The webinar will feature Dennis Clark, Luminoso’s Chief Strategy Officer, and Chris Giaretta, Connotate’s Vice President of Sales Engineering. We’ll tell you about:

  • Fundamentals of web data collection (the automation process)
  • Differences in data sources (Deep Web, password-protected sites, social media)
  • How to compare cost/benefits of manual versus automated approaches
  • Fundamentals of sentiment analysis
  • Real-world use cases of web data collection and analysis

And, of course, more.

RSVP here. We’d love for you to join us.