Friday, February 27, 2009

Crowdsourcing Spatial Surveys and Mapping

Below is a paper we will be presenting in March at GISRUK 2009. The full reference is:

Crooks, A. T., Hudson-Smith, A., Milton, R., and Batty, M. (2009), Crowdsourcing Spatial Surveys and Mapping, in Fairbairn, D. (ed.), Proceedings of the 17th Geographical Information Systems Research UK Conference, Durham University, England, pp 263-269. (pdf)

We thought we would put it online to gauge people's thoughts about it, as it is the product of the crowd. Any comments and suggestions are most welcome.

Why blog about this work? It demonstrates the potential of crowdsourcing people's opinions on specific questions over space and time, both statistically and geographically. Such work potentially allows one to crowdsource people's perceptions of issues such as fear of household burglary, the quality of local schools or voting intentions. It also shows the value of accessing real-time information and putting it to use. For example, with the growth in mobile phones with built-in GPS (such as the iPhone), if one had enough participants one could use the data to calibrate pedestrian or traffic simulations and so potentially help to understand human behaviour, such as people's daily movement patterns (see urbanTick for such work).

Crowdsourcing Spatial Surveys and Mapping

1. Introduction

This paper presents the potential of linking the GMap Creator software and the MapTube web service to create near real-time spatial surveys. Three different surveys will be presented which map people’s perceptions about certain questions, including the current financial crisis, anti-social behaviour and people’s thoughts on road pricing. Basic results will be highlighted for each and the geodemographic profiles of respondents will be explored. However, before discussing these, the underlying technologies that we use for the creation of the surveys, GMap Creator and MapTube, will be introduced.

1.1. GMap Creator

GMap Creator is a free piece of software that takes a shapefile and enables the creation of thematic layers which can be quickly and easily integrated into Google Maps in a simple ‘point and click’ manner (see Hudson-Smith et al. (under review) for more details). Using GMap Creator, it is possible to overlay pre-rendered thematic tiles on top of street and satellite views of Google Maps, making it possible to show complex areal coverages. The purpose of such a tool is to build feature-rich cartographic websites that may easily be used and interpreted by individuals who have limited experience of spatial data handling (e.g. Gibin et al., 2008) rather than for more formal exploratory spatial data analysis.

1.2. MapTube

MapTube combines the generic idea of YouTube, where users can share information, with the ability of GMap Creator to create thematic maps. MapTube provides a ‘place to put maps’, as we demonstrate in Figure 1, which highlights the most viewed maps currently on the MapTube site. MapTube acts as a portal for geographic data; the data itself is not stored on the site. Every map hosted on MapTube is held on an outside server and pulled in using the XML file which is automatically created when using GMap Creator. This allows data creators to maintain ownership of the data. MapTube allows one to view and compare different datasets as a series of layers (i.e. a mashup) through the Google Maps interface. However, we are currently working on an implementation for OpenLayers (see Milton, 2008).

Figure 1. MapTube home page showing the most popular maps.

2. Near Real-Time Spatial Surveys

Not only does MapTube allow people to share and view other people’s maps but it can also be used in more innovative ways. For example, as web surveys are often aspatial, the ability to combine GMap Creator and MapTube offers a simple solution for building spatial surveys for large areas. Figure 2 shows the process of creating the near real-time maps. Users are asked a series of questions and to enter their postcode so that the results can be geo-coded. Each response is then sent to a web server, time stamped and stored in a database. Every 30 minutes (although this can be varied) a script is run to create a new shapefile, compiling all the results from a survey and aggregating them into spatial units (in this case postcode districts). The shapefile is then passed to GMap Creator along with an XML file containing settings such as the colour thresholds, the maximum level of zoom and the name of the shapefile field to be mapped. GMap Creator then creates a series of image tiles which update the map on MapTube, which can then be served back over the internet.
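The aggregation step described above can be illustrated with a minimal Python sketch. This is not the script used for the surveys: the function names, the in-memory list of responses and the alphabetical tie-break are all assumptions made for illustration. It only shows how individual responses might be counted per postcode district and reduced to the most frequent response, which is the value the maps colour by.

```python
from collections import Counter, defaultdict


def aggregate_by_district(responses):
    """Count answers per postcode district.

    `responses` is an iterable of (district, answer) pairs
    (a hypothetical stand-in for rows read from the survey database).
    """
    counts = defaultdict(Counter)
    for district, answer in responses:
        # Normalise the district so "wc1e" and "WC1E" are counted together.
        counts[district.strip().upper()][answer] += 1
    return counts


def most_frequent_answer(counts):
    """Return {district: answer}, breaking ties alphabetically (an assumption)."""
    return {
        district: min(tally, key=lambda a: (-tally[a], a))
        for district, tally in counts.items()
    }


# Made-up sample responses, for illustration only.
responses = [
    ("NR1", "fuel"), ("NR1", "fuel"), ("NR1", "food prices"),
    ("WC1E", "mortgage or rent"), ("wc1e", "mortgage or rent"), ("WC1E", "fuel"),
]
counts = aggregate_by_district(responses)
dominant = most_frequent_answer(counts)
print(dominant)  # {'NR1': 'fuel', 'WC1E': 'mortgage or rent'}
```

In a pipeline like the one in Figure 2, the resulting per-district values would be written to the attribute table of the shapefile before it is handed to GMap Creator; that step is omitted here.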

Figure 2. The process of gathering, storing and creation of maps.

What follows are three surveys which map people’s perceptions about certain issues, carried out in association with various BBC organisations. For each survey no personal information was collected and participants were reassured that actual locations could not be identified. This was ensured through the use of postcode districts rather than the postcode unit or building address, thereby preserving data confidentiality. Used in conjunction with MapTube, this allowed participants and other users to take other information and lay the maps on top of one another.
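As an illustration of working at the postcode district level, the sketch below reduces whatever a respondent typed to its district. It relies on the standard structure of UK postcodes, in which the inward code is always the final three characters (a digit followed by two letters) and the outward code before it is the district. The function name is hypothetical and the sketch does not validate that the postcode actually exists.

```python
import re

# Inward code: one digit followed by two letters, at the end of the string.
_INWARD = re.compile(r"\d[A-Z]{2}$")


def postcode_district(raw):
    """Return the postcode district (outward code) for a full or partial postcode."""
    compact = re.sub(r"\s+", "", raw).upper()
    # The shortest full postcode has five characters (e.g. "M1 1AA");
    # anything shorter is assumed to be a district already.
    if len(compact) >= 5 and _INWARD.search(compact):
        return compact[:-3]
    return compact


print(postcode_district("wc1e 6bt"))  # WC1E
print(postcode_district("M1 1AA"))    # M1
print(postcode_district("NR1"))       # NR1
```

Storing only the returned district means that a full postcode, if entered, never needs to be kept alongside the response.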

2.1. Mapping the Credit Crunch

A pilot study was carried out as an experiment to create a mood map of the credit crunch within the United Kingdom in conjunction with the BBC Radio 4 iPM show. Participants were asked to name the single most significant factor hurting them the most about the credit crunch by choosing one of six options: mortgage or rent, fuel, food prices, holidays, other, or the credit crunch is not affecting me. They were also asked to enter the first part of their postcode (postcode district) so their responses could be geo-tagged.

Between 26th April and 29th June 2008 there were 23475 responses to the survey, with 48.8% of respondents saying that fuel was the most significant factor hurting them (Figure 3). However, there was spatial variation around the country, with more respondents within Greater London saying it was either mortgage or rent, or food, as shown in Figure 4.

Figure 3. Overall percentages for the Credit Crunch Survey.

Figure 4. Results of the Credit Crunch Survey Focused Around London (Note: the Colour represents the Most Frequent Response in the Postcode District).

2.2. Anti-Social Behaviour in East Anglia

The Credit Crunch Map has since led to BBC Look East using the system to map people’s perceptions of anti-social behaviour.

Anti-Social Behaviour in East Anglia from Andrew Crooks on Vimeo.

Each respondent was asked “what problems do you face where you live?” Respondents had five options: drunken youths, noisy neighbours, boy racers, no problems, great community and no problems. The survey ran between 4th July 2008 and 12th September 2008, during which time 6902 responses were received. Figure 5 shows the overall percentages, with 33.7% saying drunken youths and the other categories spread relatively evenly between 14% and 18%. Figure 6 maps the responses, with drunken youths clustering around urban areas such as Norwich and Newmarket.

Figure 5. Overall Percentages for the Anti-Social Behaviour Survey.

Figure 6. Results of the Anti-Social Behaviour Survey Focused Around East Anglia (Note: the Colour represents the Most Frequent Response in the Postcode District, click here to see the map).

2.3. The Manchester Congestion Charge

There was a proposal for Manchester to introduce a congestion charge zone, in which motorists would pay to drive in and out of the city at peak times. The BBC North West Tonight programme wanted people's reactions to the proposed Greater Manchester congestion charge, from within the city but also from people who drive in from outside the region, as these people don't get a vote but may end up paying the charge (subsequently the people of Manchester said no).

The Manchester Congestion Charge from Andrew Crooks on Vimeo.

People were asked the following question: “If a congestion charge is introduced in Greater Manchester, along with significant investment in public transport, will you:” and then asked to select one of the following options: drive and pay the charge, drive at different times, use public transport/motorbike/bicycle, work or shop elsewhere, or I am not affected by these changes. The survey began on 14th October 2008. By 10th December 2008 there were 14933 responses, with 46.8% saying they would work or shop elsewhere (Figure 7). This online collaboration provided a unique picture of how well the proposal was going down across the north west of England, as the map was updated every day (Click here to see the final map).

Figure 7. Overall percentages for the Manchester Congestion Survey.

3. Geodemographic Profiles of Respondents

While we only asked respondents for the first part of their postcode, many entered their full postcode, as can be seen in Table 1. We note that this is not a representative sample but it does provide an opportunity to further investigate who is responding to such surveys. To gain this understanding we use two geodemographic classification schemes. The first is the Acorn classification from CACI, which categorises neighbourhoods based on multidimensional socio-demographic attributes. The second is the e-Society geodemographic classification (Longley et al., 2008), which categorises neighbourhoods based on their engagement with new information communication technologies.

For the analysis, index scores were calculated. An index score compares the over- or under-representation of a specific target variable against a base population (e.g. the national average): a score of 100 is the national average, 200 is double the national average and 50 is half the national average. This analysis shows that it is the middle and upper classes who are over-represented within the surveys, as shown in Table 2; this potentially relates to the demographics of the readers, listeners and viewers of Radio 4 and BBC news. The over-representation of E-business users in the E-society classification (Table 3) suggests that many respondents are answering the questionnaire while at work. Furthermore, the geodemographic profiles of responses to individual questions can also be explored, as seen in Table 4. Across all demographic groups the biggest concern was fuel.
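As a worked illustration of the index score, the sketch below divides a group's share of respondents by the same group's share of the base population and scales the result so that 100 means parity. The function name and the counts in the example are made up for illustration and are not taken from the survey tables.

```python
def index_score(target_count, target_total, base_count, base_total):
    """Index of a group's representation among respondents relative to a base population.

    100 means the group is represented exactly as in the base population,
    200 means twice as much, 50 means half as much.
    """
    target_share = target_count / target_total  # share among survey respondents
    base_share = base_count / base_total        # share in the base population
    return 100.0 * target_share / base_share


# Hypothetical numbers: a group makes up 15% of the base population.
over = index_score(300, 1000, 15, 100)   # 30% of respondents
under = index_score(75, 1000, 15, 100)   # 7.5% of respondents
print(round(over), round(under))  # 200 50
```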

Table 1. Total Number of Respondents to Surveys and Number Who Entered Their Full Postcode.

Table 2. Index Scores of Respondents by Acorn Category Classification.

Table 3. Index Scores of Respondents by E-Society Group Classification.

Table 4. Percentage of Responses to the Credit Crunch Survey Broken Down by Acorn Category.

4. Discussion

This paper has demonstrated the potential of using GMap Creator and MapTube for near real-time spatial surveys, thus providing a resource to map the nation’s opinions on specific questions over space and time, both statistically and geographically. The potential of this approach for gathering spatial information is enormous. For example, it could easily be used to gather other information such as fear of household burglary, the quality of primary school education and so on. We consider this, in many senses, to be Web 2.0 and Neogeography in action.

However, the geodemographics of the respondents show that there is an inherent bias in who is answering the questions, and there is the question of whether or not respondents are influenced by the maps before answering. Further work will explore how the maps evolve over time, as each response is time stamped, and how this relates to news headlines. Additionally, we are currently exploring the geodemographic profiles of each survey in more detail. We have recently re-run the credit crunch survey with the BBC, with slightly different answer options.

The question remains the same - "what single factor is hurting you most about the credit crunch?" But we decided to change the categories slightly: Mortgage or rent, Petrol, Food prices, Job security, Utility bills, or Not affected. This survey ran between 5th October 2008 and 3rd February 2009 and has now closed. The final map can be viewed here. During this time we received 20,072 responses, which can be broken down as follows (Figure 8): Mortgage or Rent 11.05%, Petrol 4.7%, Food Prices 11.89%, Job Security 27.25%, Utility Bills 21.92%, and Not Affected 23.20%.

The Return of the Credit Crunch on the BBC Site

Figure 8: Overall percentages for the Credit Crunch Survey

5. References

Gibin M, Singleton AD, Mateos P, and Longley PA. (2008) Exploratory cartographic visualisation of London using the Google Maps API, Applied Spatial Analysis and Policy 1(2) pp85-97.

Hudson-Smith A, Crooks AT, Gibin M, Milton R, and Batty M (under review) Neogeography and Web 2.0: Concepts, Tools and Applications, Journal of Location Based Services.

Longley PA, Webber R, Li C, (2008) The UK geography of the e-society: a national classification Environment and Planning A 40(2) pp362-382.

Milton R (2008) GMap Creator, OpenLayers and OpenStreetMap CASA Blog. Available at .

Thursday, February 26, 2009

Agent-based models for Venice

While reading the Venice 2.0 blog by Fabio Carrera, which explores research projects in Venice, Italy, I came across two interesting agent-based models. The first simulates Venetian boat traffic for city officials (click here to see a movie). The second models the evacuation of Saint Mark's Square. Both models were co-created with the Redfish Group (click here to see an earlier post on the work of Redfish).

Further information about the work can be seen on the SantVe site.

Simulating boat traffic in Venice (using NetLogo, Source SantaVe)

Simulation of pedestrian traffic through a square in Venice by Fabio Carrera. Source Venice 2.0.

Wednesday, February 25, 2009

Urban and Land Cover Modeling: The Gigalopolis Project

While not new, I thought it's about time I did a blog post on the SLEUTH model. What is it? The SLEUTH model is a tightly coupled, modified cellular automaton model of urban growth (and other land class change) which has been applied to over 100 cities and regions over the last decade (Click here to see some of them). For example, it has been used to explore future land use patterns under different policy scenarios in the Washington-Baltimore metropolitan area. The name SLEUTH is derived from the simple image input requirements of the model: Slope, Land cover, Exclusion, Urbanization, Transportation, and Hillshade.

The project was released under the name Gigalopolis (a growing urban structure containing billions of people worldwide) and is a collaboration between the US Geological Survey and the Department of Geography at UC Santa Barbara, specifically Keith Clarke. I recommend checking out the site. Not only does it contain the SLEUTH model (written in C), which can be downloaded along with a sample data set, but there is also information about the background of the project (including publications), the type of data the model requires, and application areas. Furthermore, the site provides useful links to information on CA in general, as well as to other CA land cover models.

The image above shows the forecast of development patterns in 2030 for an ecologically sustainable scenario in the Washington-Baltimore metropolitan area (click here for more information and the source of the image).

Tuesday, February 24, 2009

Urban Informatics

Cities provide habitats for over half the world's population and this is expected to increase to 70% by 2050 (United Nations, 2007). Coinciding with this growth is the increasing use of information communication technologies (ICTs) such as mobile phones, wireless networks, Global Positioning Systems (GPS), and web-based services in our personal and professional activities. Such technology is changing our daily lives, not only in the way we communicate and interact with each other or share information but also in how we view the city.

Urban informatics research has evolved to explore this change. It is concerned with the impact of technology, systems and infrastructure on people in the urban environment. It draws together researchers from a broad spectrum of academic communities, for example the social (media studies, cultural studies etc.), the urban (urban studies, urban planning etc.) and the technical (computer science, software design, human-computer interaction etc.), and focuses attention on the opportunities and problems of such ubiquitous computing.

Marcus Foth, in the edited book entitled “Handbook of Research on Urban Informatics: The Practice and Promise of the Real-Time City”, brings together 29 chapters on recent research and development in the field of urban informatics from around the world. The book covers a plethora of topics including community engagement, digital cities, digital identities, locative media, mobile and wireless applications, participatory planning, personal privacy, surveillance and sustainability.

The book is split into six parts. The first part introduces urban informatics and highlights its diverse application areas, including how people are adapting to such digital technologies and how they are impacting on urban dynamics. Section two explores who is participating and how ICTs are being used. Questions addressed include why people join online communities, the use of ICTs for community planning, how such technologies can trigger or sustain civic participation, and how ICTs can be used in public spaces for collective expression through large screen interactive projections.

The third section explores how one can engage urban communities through the use of ICTs and enable the communication of information and interaction between people in the city, for example through location-based media accessed through GPS-enabled devices. Section four discusses how ICTs can impact on location, navigation and space, such as studying online social networks alongside their real-world counterparts, how augmented maps can aid car navigation, or how virtual cities can be used as a test-bed for examining the design of urban public spaces through the use of agent-based modelling. Section five discusses the development of wireless and mobile cultures and how such technologies impact on our daily lives and activities both at home and in the city, including the development of community wireless networks and how mobile phone use in urban cultures is reducing the need for premade plans.

The final section explores how ubiquitous and pervasive computing might develop and be used in the future, not only how it can advance urban functions but also its potential problems. For example, the increasing deployment of sensors and electronic devices within the transport network will allow for real-time access to route planning, thus avoiding congested areas. For the citizens of a city, the development of ICTs will encourage greater mobility or could be used to coordinate social action such as protests (see Justo, 2004). ICTs could also be used as a tool for collective problem solving, but such technology will also challenge existing social structures and potentially distance people from more traditional face-to-face contact.

In summary, the book pulls together a diverse literature on ICTs and how they are used throughout the world. The chapters range from critical reflections on who is using the technology and how, to more technical applications. The chapters are mostly well written and referenced. Each chapter has a key terms section which is extremely useful in understanding each chapter. My only criticisms of the book are the quality of the images (they are all black and white, while some of the text refers to them in colour) and the cost at $265. Nevertheless, it does provide a useful resource for social and computer scientists.

In relation to my own research, I am interested in how such technologies can be utilised for the creation of agent-based models and thus providing a greater understanding of cities and their inhabitants.


Justo, P D, 2004, Protests Powered by Cellphone, The New York Times (September 9th).

United Nations, 2007, World Urbanization Prospects: The 2007 Revision, Department of Economic and Social Affairs: Population Division, New York, NY.

Monday, February 23, 2009

Work update

Sorry for the lack of posts recently, we have been working on a number of projects. One is creating a simplified road network for London to explore the road structure in relation to network theory (see Masucci) but also for its use in accessibility measures. Our second project is building a detailed land use database for London (using SQL Server which we access through ArcSDE).

The purpose of the database is so that our research group can use it for various applications (such as land use modelling, residential agent-based modelling, urban sprawl analysis, sustainability, rain water harvesting etc). The aim of the land use database is to tag all the buildings within London with various attributes such as use, whether it is a house, a flat or an office etc. The data sets we are using include Ordnance Survey MasterMap and Address Layer 2, and building heights via LIDAR data from InfoTerra. We are using Cities Revealed data for residential building types and age, along with several other datasets. When combined, these will allow for fine-scale and extensive modelling of London’s housing market and built environment.

Below are some preliminary outputs, including a land use visualisation of the Isle of Dogs, the London Borough of Tower Hamlets broken down by residential property types and finally residential density within a section of the Isle of Dogs.

Isle of Dogs Land-use 3D Visualisation (Red is Residential, Dark blue is Office, Light blue is Office Mixed Use).

Housing Classification of Tower Hamlets, London (yellow is terraced housing, blue is flats and grey is non residential).

Residential Density within the Isle of Dogs (Dwellings per Hectare)

Agent-based models for Latin American Cities

Latin American cities are characterised by high speeds of urban growth and the development of spontaneous settlements. Such change has produced many problems especially in relation to urbanisation and housing.

The other day I came across the work of Guilherme Ressul who maintains the tzero site. One of his projects is to model the growth and development of Favelas in Brazil. For the project he has created an agent-based model using Maya to explore the growth of such developments. If you are interested in such 3D visualisation and agent-based modelling it is worth exploring his Favela micro-site, which includes movies, scripts and further information about the project.

Another interesting agent-based model specifically developed for the study of Latin American cities, in relation to urban growth is that by Joana Barros. Her model focuses on a specific kind of urban growth that happens in Latin American cities, called “peripherisation”. This is characterised by the formation of low income residential areas in the peripheral ring of the city.

More information on Joana's work can be found here.

Sunday, February 22, 2009

Virtual cities for simulating movement in public spaces

While reading Urban Informatics (Foth, 2009) I came across an interesting article by Nakanishi et al. (2009) entitled “Virtual Cities for Simulating Smart Urban Public Spaces” which explores the use of virtual cities as a test bed for examining the design of urban public spaces. Specifically, the authors combine an agent-based model with a virtual city model (in this case a platform at the Kyoto subway station) and use augmented reality, through positioning sensors around the station, to allow humans to interact with the agents (as if the agents and the humans were in the same crowd). They then simulate an emergency.

Further information about the project, including the software Freewalk where humans and agents can socially interact with one another in a virtual city space can be found here.


Foth, M. (2009), Handbook of Research on Urban Informatics: The Practice and Promise of the Real-Time City, IGI Global, Hershey, PA.

Nakanishi, H., Ishida, T. and Koizumi, S. (2009), 'Virtual Cities for Simulating Smart Urban Public Spaces', in Foth, M. (ed.) Handbook of Research on Urban Informatics: The Practice and Promise of the Real-Time City, IGI Global, Hershey, PA, pp. 257-269.

Friday, February 13, 2009

Moving throughout the City

For the last week, and probably for a few more, I have been walking around with a GPS device (Garmin Foretrex 201) strapped to my wrist. Why? It is part of an experiment by a PhD student at CASA called Fabian Neuhaus who writes an excellent blog called Urban Tick. Fabian has given several devices to volunteers to track them around the city. The intention is to collect information about the spatial extent of everyday routines. The movie below shows what one week's activity looks like on top of Google Earth.

UDp_090212_GoEa from urbanTick created by Fabian Neuhaus.

More information can be seen on Fabian's blog: Urban Tick.

Computational Social Science

I came across an interesting idea from the SimSoc mailing list sent by Professor Nigel Gilbert, about crowdsourcing a book on computational social science. See below for the email message:

Sage will be publishing a four volume set on Computational Social Science in Spring 2010. This four volume set will reprint the key articles in the emerging field of computational social science. It will include:
· the hard-to-find classic papers that first signalled the potential of the computational approach;
· a selection of influential examples of computational social science from a wide range of social science disciplines, including economics, sociology, geography, political science, social psychology, anthropology and archaeology, and business and management; and
· contributions on the methodology of computational social science, including comparisons with other approaches.
Computational social science is here defined as the use of computational models (so including all forms of simulation, but not, for example, equation-based models).

The set will include approximately 80 articles. The great majority will be either articles originally published in academic journals, or as chapters from edited collections derived from conferences. They will be divided into sections, each with a brief introduction.

You can help with selecting the 80 articles (no complete books or lengthy reports). The articles proposed so far can be found in a CiteULike group at:

You can add your suggestions, and comment on the items already proposed. Everyone who contributes or comments has a chance of winning a free copy of the four volume set (worth about £500, $1000, €750). The winner will be selected at random before the date of publication from those who have participated.

This is quite an interesting experiment and if nothing else the page should provide a useful reference for those interested in computational social science. For example, the often overlooked Sakoda (1971) Checkerboard Model of Social Interaction is mentioned, along with many more articles which I have found useful in understanding and exploring computational social science (CSS).

Monday, February 09, 2009

N-Person Prisoner's Dilemma: A Spatial Application

In the current issue of JASSS there is an article by Conrad Power entitled "A Spatial Agent-Based Model of N-Person Prisoner's Dilemma Cooperation in a Socio-Geographic Community" where he presents a spatial agent-based model of the N-person prisoner's dilemma (NPPD). The NPPD is a social dilemma game which focuses on the simulation of collective actions and behaviours within social groups.

The purpose of the model is to present a spatial agent-based approach for modelling the processes of communication and cooperation within a socio-geographic community. The model itself is written in Java and utilizes RepastJ, OpenMap and JTS to simulate agent interactions, movements, and the NPPD game play in the town of Catalina, Newfoundland and Labrador, Canada.
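For readers unfamiliar with the NPPD, the following is a minimal Python sketch of a generic N-person prisoner's dilemma payoff. It is not the payoff scheme from Power's article (which is written in Java): the linear public-goods form, the function name and the benefit and cost values are all assumptions chosen for illustration. Each cooperator pays a cost, and the benefit generated by all cooperators is shared equally by every player.

```python
def nppd_payoffs(n_cooperators, n_players, benefit=3.0, cost=1.0):
    """Return (cooperator_payoff, defector_payoff) for one round.

    Assumed linear public-goods form; the dilemma holds when
    benefit / n_players < cost < benefit.
    """
    shared = benefit * n_cooperators / n_players  # everyone receives this share
    return shared - cost, shared


N = 10
all_cooperate, _ = nppd_payoffs(N, N)  # payoff to each player if everyone cooperates
_, all_defect = nppd_payoffs(0, N)     # payoff to each player if everyone defects
print(all_cooperate, all_defect)  # 2.0 0.0
```

With these values, any individual cooperator among k cooperators would gain by switching to defection (compare the cooperator payoff at k with the defector payoff at k - 1), yet everyone is better off when all cooperate than when all defect, which is the tension the game is designed to capture.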

The full article can be found on the JASSS site.