Just an idea I thought about last night:

Premise

1. More and more people are starting to use and work with open data, and institutions are opening more and more datasets to the public. That is, of course, great, but only until you reach the borders of your own country. Once you need highly specific data from other countries, you crash into the language barrier, and that's it. This happens even though almost every country in, for example, Europe has its own volunteers and people passionate about making open data happen.
2. The only service that the mostly government-run open data websites provide is crude datasets in several formats that you have to download and then run locally, with little or no description of what the dataset is actually about. That is extremely problematic for people with little technical background who could in reality benefit from working with the data. Opening the data to the masses should be the ultimate goal anyway.

Proposition

-create a platform run by a non-government non-profit that would aggregate regional datasets and translate them to English with help from local volunteers
-in most cases translating only the headers is enough to render the whole dataset understandable
-annotate every dataset (again, crowdsourced) in plain English, explaining what kind of data the dataset consists of, what can be done with it, and how it can be connected to other similar or relevant datasets
-apart from providing an option to download datasets for local use, store them on a cloud-based platform (Azure, AWS, Google Cloud Platform) for online use. It's cheap, pay as you go, and probably even free, considering that it would benefit the provider in the future (business from applications built on open data, PR, consistency with its mission statement)
-provide a simple online interface to work with the data in real time, similar to Kaggle (www.kaggle.com) or the Teradata console, where people can explore and play around with what interests them, help annotate datasets better, show their work and inspire others
-connecting data from developed countries (US, Europe) with developing countries (SE Asia, Africa) can yield new business opportunities unseen before because of language barriers. This time with the information flow in the opposite direction: Developed -> Developing
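
To make the "translate only the headers" point a bit more concrete, here is a rough sketch of what that step could look like in Python. Everything in it is an assumption for illustration: the Czech column names, the made-up sample row, the HEADER_TRANSLATIONS mapping (which in practice would come from volunteers) and the translate_headers helper are all hypothetical, not part of any existing platform.

```python
import csv
import io

# Hypothetical crowdsourced mapping from original headers (Czech here,
# purely as an example) to English. Volunteers would maintain this.
HEADER_TRANSLATIONS = {
    "obec": "municipality",
    "rok": "year",
    "pocet_obyvatel": "population",
}


def translate_headers(csv_text, translations):
    """Return the CSV text with only its header row translated."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text
    # Headers without a known translation are kept as-is, so no column is lost.
    rows[0] = [translations.get(h.strip(), h) for h in rows[0]]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


# Made-up sample data, not taken from any real dataset.
sample = "obec,rok,pocet_obyvatel\nTown A,2015,1000\n"
print(translate_headers(sample, HEADER_TRANSLATIONS))
```

The data rows are left untouched, which is the whole point: the numbers and place names usually do not need translating, only the column names do.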
