The UN predicts the world’s population explosion: visualized in Impure

Did you know that Spain shares a population growth pattern with Chile, Saint Lucia and Vietnam? Did you know that Germany shares a population growth pattern with Russia, Japan and Serbia?

Yesterday, The Guardian published an interactive visualization developed in Impure on the evolution of the world’s population (hitting 7 billion this month). The result is based on statistics recently published by the UN Department of Economic and Social Affairs, covering world population from 1950 to 2010 and projections for every country up to 2100. It shows explicitly the dramatic changes in the populations of the world’s countries.

The visualization is divided into three main parts.

1. A world atlas shows every country’s population at three different times: 1950 (blue circles), 2010 (magenta circles) and 2100 (red circles), and you can choose among three different intervals: 1950-2010, 2010-2100 and 1950-2100. If you look at this atlas, you will immediately see one visual pattern: huge population growth in Africa for the period 2010 to 2100. In contrast, Europe’s population growth declines in that same period. Take a closer look by zooming in on any of three geographic areas (Europe, South Pacific and Caribbean Region).

2. You can select any country in the atlas and highlight its population and life expectancy (which is, together with fertility rate, a key variable for understanding population growth) in 1950, 2010 and 2100.

3. Finally, starting from the country selected in the atlas, the Impure code uses Pearson’s correlation coefficient to find three countries with similar growth patterns.
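As a rough illustration of that third step, here is a minimal Python sketch of how Pearson’s correlation coefficient could be used to pick the three series most similar to a selected one. It is not the Impure code: the function names, the labels A to E and the numbers are all hypothetical, and which series the real project correlates is an assumption.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson's correlation coefficient between two equally long series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series has no defined correlation; treat it as 0
    return cov / sqrt(var_x * var_y)

def most_similar(selected, series, k=3):
    """Return the k names whose series correlate best with the selected one."""
    target = series[selected]
    scores = [
        (pearson(target, values), name)
        for name, values in series.items()
        if name != selected
    ]
    scores.sort(reverse=True)  # highest correlation first
    return [name for _, name in scores[:k]]

# Hypothetical toy series (not UN data), one value per time step.
series = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.0],
    "C": [4.0, 3.0, 2.0, 1.0],
    "D": [1.0, 2.0, 2.5, 4.5],
    "E": [3.0, 1.0, 4.0, 1.0],
}

similar_to_a = most_similar("A", series)
```

Because the coefficient ignores scale, a small country and a large one can share a pattern as long as their curves have the same shape.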

Other Impure projects published by The Guardian:

1. Top paid civil servants and quango chiefs: See who gets the most

2. James and Rupert Murdoch at the select committee hearing

3. An animated history of UK aid 1960-2009 mapped

Posted in Examples

Impure enhancements: groups

Impure is changing fast. Besides constantly adding new modules to the list (of all 5 types), we have also been adding new features that help create better and more complex projects. One of the features planned from the very beginning (and one that a lot of people always told us was a necessity) is the grouping function: being able to select a group of modules that together perform a specific function and treat them as a single module. Groups are what other visual languages call patches or abstractions (we’ll continue using the term ‘group’ instead of ‘patch’ until some other important features are achieved).

This first version of grouping has a dramatic impact on project organization. Projects with hundreds of modules, and relations crossing the space in all directions, are very hard to manage and were, so far, unavoidable. Now, with groups, it’s much easier to keep spaces clean and to identify the complex tasks associated with groups. Spaces become more legible, so the impact is positive not only for their creator(s) but also for those who have to understand or modify them later (something very likely to happen, since it’s easy to share the code of any space).

To create a group you only need to select some modules and then press ⌘G.

To collapse or expand a group, just double-click on it. You can name your groups (click on the upper text field).

 

Finally, you can copy (and therefore share) groups. Just click on a group (collapsed or expanded) and press ⌘C.

This is just a first version; future ones will let you:

- create groups inside groups (and so on)
- edit groups isolated from other modules in the project
- store groups in a library

In the meantime, it’s a good idea to keep a .txt file in which you paste the code of groups that may be used in several spaces.

See how this space (a simple map of the entire Impure platform) looks in its editing environment:

And here is the code of the space (notice the last lines concerning the groups):

0 String http://dl.dropbox.com/u/459523/impure/data_files/impure/platform_tree.txt 431 178
8 String 4 461 458
9 String rectangle 451 488
10 String -1 281 548
11 String rectangle 281 568
13 ColorScale grayscale 411 358
15 Number 3 481 248
3 decodeIdentedTree 351 258
12 colorListFromColorScale 481 358
19 reverseList 531 418
20 getTreeNLevels 401 308
4 ZoomableTreemap 621 68 420 310
5 FileLoader 171 248
6 ReadCompositionProperty 351 558
7 WriteCompositionProperty 611 498
21 Delay 551 258
0 5
5 3
8 7
9 7 1
10 6
11 6 1
6 7 2
13 12
12 19
3 20
20 12 1
19 4 2
3 21
21 4
15 21 1
group 0 3 5|c|loads data & builds Tree
group 13 12 19 20|c|grayscale ColorList for levels
group 21 15|c|3 frames delay
group 8 9 10 11 6 7|c|fullscreen composition
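If you keep group code in a text file, a few lines of Python are enough to list the groups a space defines. This is only a sketch based on what the listing above appears to show, namely lines of the form `group <module ids>|<flag>|<name>`; the real Impure format is not documented here, the meaning of the `c` flag is not interpreted, and `parse_groups` is a hypothetical helper.

```python
def parse_groups(space_code):
    """Extract group definitions from the text code of a space.

    Assumes each group line looks like 'group 0 3 5|c|some name',
    as in the listing above; every other line is ignored.
    """
    groups = []
    for line in space_code.splitlines():
        line = line.strip()
        if not line.startswith("group "):
            continue
        ids_part, flag, name = line[len("group "):].split("|", 2)
        groups.append({
            "modules": [int(token) for token in ids_part.split()],
            "flag": flag,   # kept as-is; its meaning is not assumed
            "name": name,
        })
    return groups

# The four group lines copied from the space shown above.
example = """group 0 3 5|c|loads data & builds Tree
group 13 12 19 20|c|grayscale ColorList for levels
group 21 15|c|3 frames delay
group 8 9 10 11 6 7|c|fullscreen composition"""

groups = parse_groups(example)
for group in groups:
    print(group["name"], group["modules"])
```

The module ids in each group match the ids at the start of the module lines, so the names can be used as a quick table of contents for a space.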

 

Posted in News

Voronoi tessellation comparative of Bicing activity


One of the most common analyses in urban planning is determining the area of influence of a set of points: for example, to know how many people are served by each school in a city, or to manage a public transport network.

Among the alternatives for carrying out this kind of analysis, we could highlight Voronoi tessellation, an algorithm consisting of “The partitioning of a plane with points into convex polygons such that each polygon contains exactly one generating point and every point in a given polygon is closer to its generating point than to any other.”

To test this algorithm we have created an example in Impure where we compare the activity of Bicing (the public bike rental service) in Barcelona over two singular days: the day of the Champions League final and the day of the 19J demonstration. The intent of this visualization is to identify areas of increased activity and compare patterns of user activity on the network.
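The definition can be tried out with a very small Python sketch. It does not compute the polygons themselves and it is not what the Impure project does: it is a brute-force approximation that assigns each cell of a grid to its nearest generating point and counts the cells, which gives a rough area of influence per point. The function names and the coordinates are hypothetical.

```python
from math import hypot

def nearest_site(point, sites):
    """Index of the generating point closest to the given point."""
    px, py = point
    return min(
        range(len(sites)),
        key=lambda i: hypot(sites[i][0] - px, sites[i][1] - py),
    )

def cell_areas(sites, width, height):
    """Approximate the area of each Voronoi cell on a width x height grid.

    Each unit square is assigned, by its center, to the nearest site.
    """
    counts = [0] * len(sites)
    for y in range(height):
        for x in range(width):
            counts[nearest_site((x + 0.5, y + 0.5), sites)] += 1
    return counts

# Two hypothetical generating points on a 10 x 10 grid.
areas = cell_areas([(2, 5), (8, 5)], 10, 10)
```

With these two points the boundary falls halfway between them, so each one gets half of the grid.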

Posted in Examples

introducing Multi-search tools

The main interface between people and the Internet is the small text input field of the Google search engine. Such a big universe of diverse contents (diverse in format, size, language, structure, message, time and place of creation…) is commonly observed through a single, narrow lens, a cryptic and rigid algorithm with one basic rule: enter a short text and expect a list of websites. It is like setting out on a journey of galaxy exploration carrying only a magnifying glass.

In the discussion about the evolution of the Internet, the relation between people and the Internet is more relevant than the Internet itself; in other words, the Internet is how people relate to it. So, as Lev Manovich says, a critique of the search paradigm is urgently needed.

Currently, only programmers have a much richer relation with the digital universe, mainly because they can work with APIs, which give them much more control and access to valuable information. One of the main differences is that a programmer using APIs accesses richly structured data (not just lists of results). Programmers can understand sets of information from the point of view of semantics, syntax, geography, time and many other dimensions. They can also combine sources; clean, filter and analyze the information; and, finally, create visualizations. But even a good programmer needs a lot of time to do all these complex things.

From this perspective, it’s quite evident that there is a huge need for new tools that allow people to interact with the Internet in rich ways (I would also say non-linear ways). Impure is a tool aimed at giving people easy and fast access to multiple sources and services, and at combining information coming from diverse sources (including your own). It defines a new kind of relationship between people and the Internet.

One of the (immediate) possibilities of Impure that goes beyond simple search is the multi-search, introduced in a previous post. In its simplest version it just requires a StringList (a list of texts) and a single API module.

Impure is also a tool for creating tools. The project I present here contains some of these multi-tools, aimed at interacting with the Internet, using the Google search engine, in a very different way.


• In the first tool you type a sentence with two variables denoted by the letters X and Y that will be replaced by numbers, generating multiple texts (100) that will be searched. You can try things like “the match ended X-Y”, “X pros and Y cons”, “X than Y”, “I’m X and my boyfriend is Y” (in this last example the first value may be 14).

• The second tool just requires a list of words or short texts separated by commas. The result is a network where each relation between two texts is calculated by searching the Internet for webpages containing both texts (this metric will be explained in a future tutorial). So, you can quickly build networks from any list of things you can think of! For instance: lists of philosophers, artists, scientists… a list of classic ingredients for recipes:

sugar, salt, tomatoes, water, flour, eggs, oats, nuts, butter, milk, wheat, soy, cinnamon, honey, bread, garlic, onion, cheese, olive, yogurt

…lists of parts of the body, adjectives, companies, sciences, technologies, lost characters or cities.

My advice is to start with a short list and, if the result is promising, continue adding elements.

In both tools you can store the obtained information by downloading a file: a csv and a gdf for a table and a network, respectively. I will continue evolving this project, adding more multi-search tools. Also, tutorials explaining how these tools are created with Impure will be coming.
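For readers who want to see the idea behind the first tool outside Impure, here is a minimal Python sketch of the template expansion only (no searches are performed). It assumes that X and Y each run over the ten values 0 to 9, which would give the 100 texts mentioned above; the actual ranges used by the tool are not stated, and `expand_template` is a hypothetical name.

```python
import re
from itertools import product

def expand_template(template, values=range(10)):
    """Replace the standalone letters X and Y with every pair of values."""
    values = list(values)
    texts = []
    for x, y in product(values, repeat=2):
        # \b keeps capital X or Y inside longer words untouched.
        text = re.sub(
            r"\b[XY]\b",
            lambda m: str(x) if m.group() == "X" else str(y),
            template,
        )
        texts.append(text)
    return texts

queries = expand_template("the match ended X-Y")
print(len(queries), queries[:3])
```

Each generated text would then be sent to the search engine as its own query, and the number of results stored next to the pair (X, Y).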

Posted in Examples

words Shakespeare invented

Do you know how many words were coined by Shakespeare in the English language? Let me give you a hint: a lot, including the word ‘hint’! Some of these words are very common and frequently used, such as the adjective ‘obscene’ or the verb ‘to film’. Others seem to have been less successful at polluting the language, such as ‘deracinate’ (whose root is the French word racine, that is: root), which means without roots or out of one’s homeland.

The number of words Shakespeare coined is amazingly large: at least 1000. I made a compilation of many (560, available below), taking them from several online sources. Some say that these words were invented, others use the more conservative word ‘coined’, and finally some are more accurate, saying that most of these words are loanwords, because many of them were ‘borrowed’ from French or Latin. I read some passionate discussions in forums about the use of the word ‘invention’ for these cases. I kept it in the title (it is more provocative) because I think it is arguable that there is no such thing as a deracinated invention, something that came out of the blue without context, inspiration or reference.

This is the list of the 20 most common words according to their number of Google results:

advertising, 338000000
control, 227000000
design, 188000000
published, 169000000
road, 167000000
manager, 143000000
overview, 125000000
label, 87500000
useful, 83200000
cheap, 81600000
employment, 74800000
traditional, 74700000
secure, 73600000
lower, 70200000
switch, 58500000
successful, 52400000
investment, 47000000
import, 41800000
critical, 41100000
gossip, 35000000

Once I knew about the existence of these words, many of them now common currency, my first question was whether people whose mother tongue is English (not my case) were aware of this, and if so, for which words. I want to coin here a new use for the adjective deracinated: I will say that a word is deracinated if no one knows what its roots are, and I also propose a gradation in its use: a word is more or less deracinated according to the number of people using it who know its roots, its etymology, its origin, its history.

I figured out a method that gives me information about how deracinated a word is. The method is based on the multi-search technique, previously introduced and discussed in this blog. The idea is to measure the number of times a word is used, and also the number of times it is used in a context in which the name of the writer is mentioned. Imagine a word that, every time it’s used, appears in a text where the word Shakespeare is also present: we can conclude that this word is not deracinated at all. In the opposite case, a commonly used word that is never accompanied by a mention of Shakespeare seems to have a life of its own, far from its inventor, and very few people remember its origin. Using the Google engine I made two searches for each word: the word alone, and the word followed by AND Shakespeare. Then I compared both results.

The following image depicts the 590 words on a scatter plot.

Words in the upper left are the ones that have a higher proportion of occurrences in webpages mentioning Shakespeare. Words in the lower right have a lower proportion of occurrences in websites that mention the writer: they are the most deracinated. Among these words: priceless, to film, hint, obscene, and to undress.

Taking base-10 logarithms, the points are clearly distributed along a line. The deracination index is depicted by a temperature color scale (from black for the least deracinated to white for the most). The final index is defined by this formula, where G(q) is the number of Google results for the query q:

log10[ G(text) + 1 ] / log10[ G(text + AND shakespeare) + 1 ]
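The formula translates directly into a few lines of Python. This is only a sketch: the two counts are passed in by hand rather than fetched from Google, the function name is hypothetical, and returning infinity when there are no results mentioning Shakespeare is an assumption of this sketch (the formula itself would divide by zero in that case).

```python
from math import log10

def deracination_index(results, results_with_shakespeare):
    """log10(G(text) + 1) / log10(G(text AND shakespeare) + 1)."""
    denominator = log10(results_with_shakespeare + 1)
    if denominator == 0:
        # No page mentions Shakespeare: treated here as fully deracinated.
        return float("inf")
    return log10(results + 1) / denominator

# Counts taken from the tables in this post.
bemad = deracination_index(169, 2)      # listed as 4.67
outfrown = deracination_index(5, 5)     # listed as 1
print(round(bemad, 2), outfrown)
```

Adding 1 inside each logarithm keeps the index defined for words with very few results, and a word that never appears without Shakespeare gets exactly 1.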

This is the list of the 20 most deracinated words coined by Shakespeare, with their Google search results, the Google search results adding “AND Shakespeare”, and their deracination index values:

to bemad, 169, 2, 4.67
to over-red, 132, 2, 4.45
fashionmonger, 5060, 19, 2.84
eyewink, 8070, 23, 2.83
bullyrook, 400, 8, 2.72
fullhearted, 802, 12, 2.6
to canopy, 24700, 60, 2.46
nonregardance, 78, 5, 2.43
coppernose, 1430, 19, 2.42
overcredulous, 484, 12, 2.41
death’s-head, 3430, 30, 2.37
offenseful, 202, 9, 2.3
to uncurl, 4060, 36, 2.3
scrimer, 2120, 27, 2.29
to reverb, 5900, 44, 2.28
ring carrier, 2350, 31, 2.23
over-weathered, 884, 20, 2.22
to subcontract, 33200, 109, 2.21
to bedabble, 34, 4, 2.2
to outswear, 34, 4, 2.2

This is the list of the 20 least deracinated words coined by Shakespeare, with their Google search results, the Google search results adding “AND Shakespeare”, and their deracination index values:

to outfrown, 5, 5, 1
to bewhore, 2, 2, 1
to overstink, 4, 4, 1
to outsweeten, 3, 3, 1
bum-bailie, 4, 4, 1
to behowl, 3, 3, 1
to enclog, 11, 10, 1.03
to discandy, 7, 6, 1.06
fubbed off, 59, 41, 1.09
cross-gartered, 1050, 565, 1.09
to outscold, 5, 4, 1.11
flirt-gill, 396, 214, 1.11
to unfool, 21, 15, 1.11
to overbulk, 42, 28, 1.11
scroyle, 624, 311, 1.12
to unsex, 1020, 482, 1.12
semblative, 286, 153, 1.12
drollery, 29700, 9150, 1.12
arch-villain, 32700, 9330, 1.13
rug-headed, 214, 97, 1.17

Here is the bare list of words.

Here is the complete list of words with values, and in CSV format.

And this Impure space is the one used to perform all the Google search tasks and build the Table and the CSV. Once the space is opened it starts searching using the InternetMultiNSearchResults API module. It also contains a simple scatter visualization. As usual, Impure users can copy the code of this space in order to modify it and create new things. In general terms, this structure allows measuring how strongly an item is related to each item in a set. In this case the first item is Shakespeare and the others are words or texts, but it could be a brand name and a list of adjectives, an adjective and a list of brand names, a politician and a list of corruption criminals, etc… All these are examples of one-to-many multi-searches, but it could of course be many-to-many, which leads to the construction of networks, a topic I will address in future posts.

 

 

 

 

 

 

Posted in Examples