Hack Days get people engaged with and excited about a subject, and our subject is history. There's a lot of it about, but lots of it is stuck, on paper, in old HTML pages, in archives, on hard-to-reach databases. We want people to figure out how to set some of that free, and start bringing that history to life as... whatever crazy things people can come up with.

We want people to come up with hacks that make history accessible and understandable in new ways.

Think Scrapheap Challenge meets In Our Time versus Time Team in bed with A history of the world in 100 objects.

We also want to get as many smart people interested in history into a room as we can: there are a whole load of hard problems we might be able to come up with some good ideas about solving (How should digital maps deal with changing landscapes? How might we join up some of those unfortunately separated data sets? How should government store this stuff going forward?)

There'll be updates and thoughts here on the blog, and you can sign up and join in over at the History Hack Day Wiki

Yahoo! Open Hack Europe

There's going to be some History Hacking going on at Yahoo!'s Open Hack Europe in Bucharest over the weekend of 14-15 May.

I'll be attending and giving a little History Hacking pep talk before spending the weekend hanging around with our secret* team of History Hacking specialists to help people with idea and data wrangling.

If you're going to be there then you can head over to the History Hacks page on the Open Hack wiki.

* All will be revealed once we've all found our passports.

Expect more updates as the weekend approaches...

The hacks!

It gives me great pleasure to present these, the Compleat Hacks from History Hack Day 2011. I've given URLs where I have them, and I'll happily add links to the hacks (and links to blog posts &c. about them) if people have links that I don't (very likely). Once I have the video edited I'll try and link to the presentations.

Best hack

Cristiano Betta's A mobile history of the world in 100 objects

http://hotw.cgb.im

Allowing a listener of the podcast to be guided through the British Museum and explore the objects while listening to the audio podcast

Best visualisation

Winner: Gareth Lloyd & Tom Martin's A history of the world in 100 seconds

We present a video encompassing all of world history in 100s

Simon Harriyott's GeStation

http://harriyott.com/geStation/

The birth of UK railway stations

Most unexpected hack

Winner: Brian Suda's Titanic Ticket Match-up

+1 (804) 316-9215 (US), +44 2035142721 (UK), or +990009369991481398 (Skype)

Phone up and answer a couple of questions to see if you'd make it off the Titanic or not

Tom Scott's Magical Mystery Ley Line Locator

http://www.tomscott.com/ley/

Ley Lines: mysterious lines of force between ancient monuments. Are you one of the lucky Britons that lives on one of these mystical energy highways?

Best game

Winner: Mike Stenhouse's Wikipedia Top Trumps

An overly-ambitious attempt to turn Wikipedia into a deck of Pokemon cards

Simon Cross & Seyi Ogunyemi's Plaqathon

http://plaqathon.appspot.com/mobile

Find Blue Plaques near where you are and check in to them. See which Plaques your friends have checked into, which are the most popular Plaques and who are the most prolific Plaque collectors!

Best public history

Winner: Tom Morris's Wiki Loves Geo

http://blog.tommorris.org/post/2988268065/wikilovesgeo-we-have-explosive

Helps you find Wikipedia articles tied to a location that need photographs of that location.

Crystal & Steven's Victorian photographers

Discover the Victorian age through Victorian photographs and find out more about the photographers. Details like where the photographers lived, when they lived, the photographs we have for each photographer in our collection.

Honourable mention

Jed & Paul Downey's Tiddly Trains

http://railways-preso.tiddlyspace.com  http://whatfettle.com/2011/01/TiddlyTrains/

A tentative exploration of the state of UK preserved railways

And not forgetting...

Morena & Chris's Price Re-enactment Adjustment Tool (PRAT)

http://plaqathon.appspot.com/mobile

Using the Open Plaque data and 1940s Blitz data, we cross-reference current house prices and adjust them relative to the events that happened in the near vicinity.  The idea was to make some sort of game out of this so that people pick a house and we can tell them how much money they might make if a celebrity lived close by or (in the 1940s) a bomb dropped near their house.  We change the factors based on the type of bomb and what sort of damage it did.

Kornel Lesiński & Daniel Knell's Unix-time Top Trumps

http://hh.geekhood.net:8003/

Web-appy remake of a classic game, showing off specs of computers since the beginning of time — 1st Jan 1970

Patrick Sinclair's Most Ancientest Thing Ever

http://mostancientestthingever.heroku.com/

A crowd-sourced history of the world

Jeremy Keith's London on a Stick

http://icanhaz.com/londononastick

Pictures from the National Archives on USB sticks to be dead-dropped at the locations of the pictures.

Jonty Wareing's Local history

http://jonty.co.uk/bits/localhistory

A tiny mobile site that finds interesting historical information about your current location in London, currently using the Museum of London archaeological records. Also an API that allows the Museum of London to be queried geographically.

Imran Ghory's Body parts in History

http://www.imranghory.org/books.html

Shows the popularity of body parts in books over the last 500 years

Richard & Louise Boulton's Historical explorations in Nagaland

http://hh.geekhood.net:8003/

Displays details of several expeditions into the remote north of India, from 1870 to 1970

Paul Rissen, Jonathan Tweed, & Luke Blaney's Around This Time aka 'What's going on?'

Take one earth shattering event (from the BBC's On This Day data), and see what was happening on TV/Radio at that time. Or, name your favourite TV show, and see what was happening in the world when it was first broadcast.

Ben Griffith's Lost Bomber

http://www.techbelly.com/2011/01/23/lost-bomber/

An account of a very ordinary death.

A brief recap of the event; Some thanks and acknowledgements

Well, this has taken me a while. It's been a crazy three weeks since History Hack Day, and I thought it was high time I actually started talking about how well it went and what fantastic stuff people made.

I thought that it went very well indeed. We had a crowd of about 70 hackers of all stripes, and 20 hacks were presented at the end of the weekend. 

And, what hacks they were! We had everything from a museum guide (Ford Prefect style) to dial-a-Titanic. The hacks were many, varied, and impressive. I'll be covering them in more depth in another blog post (I have notes and everything), and we have video of the presentations (which is sitting in iMovie and taunting me). For now, I'd mostly like to say thanks to all the attendees, the venue, our volunteers, our judges, our providers-of-prizes, and our sponsors:

Hackers

Thanks to everyone who came along and joined in the making: these things are, at the end of the day, about what you make. And you made stuff that was awesome, and created a wonderful atmosphere while you did it. 

Volunteers

Thanks to Simon Harriyott, Simon Bone, and Ben Smith for stepping up and helping with lugging supplies and registering everyone.

Thanks to Nneoma Amadi-obi from GMU's History News Network who came along to report on what we got up to, and who also helped out with registration.

Thanks to Simon Hildrew for the loan of video camera and tripod. Thanks to Luke Blaney for operating the camera during the presentations.

Thanks to Steve Lawson from Amplified for running around and recording conversations with people over the weekend. 

Massive thanks go to my wife Clare for sorting out the food, drink, and the logistics. Without all her hard work the event simply wouldn't have happened.

Speakers

Thanks to our lovely speakers: Max Gadney, Francisco Jordano, Mano Marks, and Dan Pett.

Judges

Thanks to everyone who helped me judge the hacks (it was hard because the hacks were so good): Katie Ellen, Mano Marks, Dan Pett, Tom Pollard, Jo Pugh, Murray Rowan.

Providers of Prizes

Thanks to the Ordnance Survey, the British Library, the British Museum, O'Reilly, and the Science Museum for providing a fantastic array of prizes. Thanks to the Crafts Council for providing those lovely wooden USB sticks and Oyster card wallets for everyone.

Sponsors

A huge thanks to our Sponsors:

Without the sponsors, there would have been no History Hack Day.

The venue

A massive thanks to The Guardian, and especially to Alex and Emily from The Guardian's events team who gave up chunks of their weekend to help things run smoothly.

So, what happened then?

There'll be more stuff on this blog once I've edited the video and pulled my notes and the hack list together. For now, here's a small sample of what other people have been saying about it:

There are more that I've missed. (Sorry.)

There are lots of lovely photos over on Flickr, and Steve Lawson's Audioboos with people over the weekend can be found on the Amplified event page.

 

Last minute administrivia

Possibly important things:

  1. If you want to play Werewolf on Saturday evening, someone should bring a deck of Werewolf cards.
  2. If you want to play other games on Saturday evening, you should bring other games...

Actually important things:

  1. Doors open at 10.30, so arriving earlier means hanging around outside, I'm afraid.
  2. The talks will start at 11.30, so arrive in time to register, get a cup of coffee, and get a seat.

Really important things:

  1. You'll have the use of a projector during the hack presentations. It's a VGA projector, so pack adapters if you need them.

 

Help! (a call for volunteers)

With the weekend fast approaching it's clear that we're going to need a hand (or two) to keep everything rolling.

We need three or four volunteers to help registration run smoothly, help make the vast quantities of coffee we'll need to start us off (we'll have suitable equipment), and generally stop us losing our minds.

If you're coming, don't mind arriving early on Saturday, and don't mind doing a bit of work (which shouldn't interfere with hacking time) then please get in touch – email matt@reprocessed.org, or ping @fidothe or @historyhackday on Twitter.

*Insert Kitchener poster here*

Portable Antiquities

This is a guest post from Dan Pett of the Portable Antiquities Scheme

About 3 months ago, Matt contacted the Portable Antiquities Scheme via Twitter, to ask if we wanted to be involved in his History Hack Day event that is now imminent. Our project, which is funded via the DCMS, is based at the British Museum and has staff members based around England and Wales and records public archaeological discovery. This project has been running since 1997 and Nationally (covering England and Wales) since 2003, with our entire dataset available online from April 2003. Many of you will have come across our work due to the sensational discoveries that have made the media over the last few years - for example the amazing Anglo-Saxon Staffordshire Hoard, the Crosby Garrett Helmet and the Frome Hoard. All of these objects were found by metal detector users, and around 70% of our data comes from the hobby sector. All of these data are collected and published under a Creative Commons Non Commercial By Attribution Share-Alike licence (we have over 60 partners, so getting them to agree to this was quite a feat!), on our website.

Cheek piece, fittings and zoomorphic mount (Staffordshire hoard)

Victor Ambrus and Dave Crisp (Frome hoard)

Close up (Crosby Garrett helmet)

In March 2010, a new website was launched (costing only £48 to rebuild, plus the cost of two new servers), with all our web resources consolidated in one place and with these data made available in a wide variety of formats for consumption by a diverse (but very niche!) audience. Our aim is to attempt to record as many of the archaeological objects found annually as we physically can (in the last few months we have started to crowdsource records via public data entry), with geo-referenced findspots that can be made available at high resolution to academics for research. At the time of writing, we have c. 675,000 objects from 18,500 contributors available for academic use, with around 500,000 objects available to the public. The nature of our records and people's privacy means that we do have some issues with releasing full resolution data online. For example, some sites are extremely sensitive in archaeological terms; for others the landowner requests that we do not publish the location where the objects were found; and our workflow process removes some records from public view, as they are usually unfinished. At the moment, our data is also undergoing a huge geospatial audit to eradicate mistakes in find locations. Some of our findspots appear to be in the sea or in the wrong county; as with all databases, there are always some errors.

Our website and database (which is still very much a beta iteration) is built on the latest Zend Framework (php) and uses Linux and MySQL to power the services, with a large sprinkling of api use and a liberal dosage of YQL to enhance and supplement the data that the public voluntarily reports and records online. It has been built entirely by the Scheme's ICT Adviser (Daniel Pett - @portableant). At the moment we use apis from:

  • Flickr
  • Geoplanet
  • Amazon web services (book search and also S3 for backup and storage)
  • Akismet
  • Twitter
  • Wordpress XML-RPC
  • Google maps - using layers from OpenStreetMap and the National Library of Scotland
  • Google analytics for determining most visited content etc
  • Gravatar
  • Delicious
  • Open Calais
  • The British Museum's Collections Online Opensearch module
  • ReCaptcha
  • They Work for You (Parliamentary data)
  • The Guardian's open platform for retrieving articles related to our work
  • dbPedia
  • Pleiades
  • Facebook graph

Our website is heavily reliant on YQL (probably a bad thing in some ways, and if you are going to use YQL on a high traffic site, consider using the oauth endpoint) for querying these apis to produce data for consumption and redisplay. Using YQL, we have been able to pull in all our images from our flickr feed and redisplay them on our website, retrieve geodata from Yahoo! Geoplanet in the form of WOEIDs and the subsidiary information that they hold (a good standalone php package without the need for YQL has been created by Tyler Bell and can be accessed on github) and find objects within an MP's constituency - for example David Laws, which you can also get as KML or JSON - using data from theyworkforyou. Other webservices that have proved to be extremely useful include dbPedia; we have guides for people to learn more about coins in different periods and these have been embellished with details from dbPedia (abstract, date of birth etc) using a SPARQL query and cleaning the response up for redisplay. See, for example, the database entry for Cnut the Great. Cnut is famous for trying to show that his powers extended over the sea (Henry of Huntingdon chronicled this tale) when he commanded the tide to stop. Needless to say, it didn't.

The above shows how 3rd party services have been integrated to build a website that details historical and archaeological discovery, and to enrich its content. However, this site can also be leveraged for the HistoryHackDay challenge. We have made available a snapshot download of our database in CSV format, which supplements the data that you can retrieve from the online site (context switched views allow for XML, KML, rss, atom and json data responses, and a full api is nearly complete).

For example a search for Gold from Bedfordshire can be represented as a html response at: http://www.finds.org.uk/database/search/results/material/23/county/BEDFORDSHIRE/ and as a json response by adding the format/json parameter as shown here: http://www.finds.org.uk/database/search/results/material/23/county/BEDFORDSHIRE/format/json or as KML at http://www.finds.org.uk/database/search/results/material/23/county/BEDFORDSHIRE/format/kml (At present the RSS/ATOM feeds don't contain image data, but this might change by this weekend and some JSON feeds are being tweaked as you read this.) The XML response is compliant with the MIDAS specification that was developed by the Heritage Standards group and maps to the CIDOC-CRM (if you like that sort of thing....)
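The URL pattern in the example above (alternating key/value path segments, with an optional trailing format/<name> pair) can be sketched in Python. This is only an illustration built from the three example URLs: the helper names search_url and fetch_json are hypothetical, and nothing here is an official client for the database. It also assumes that other search keys follow the same key/value pattern, which the post does not confirm.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the example URLs in the post.
BASE = "http://www.finds.org.uk/database/search/results"


def search_url(params, fmt=None):
    """Build a search URL from (key, value) pairs.

    params: ordered pairs such as [("material", 23), ("county", "BEDFORDSHIRE")]
    fmt:    None for the html view, or a format name such as "json" or "kml"
    """
    parts = [BASE]
    for key, value in params:
        parts.append(quote(str(key), safe=""))
        parts.append(quote(str(value), safe=""))
    if fmt is None:
        # The html example in the post ends with a trailing slash.
        return "/".join(parts) + "/"
    parts.extend(["format", fmt])
    return "/".join(parts)


def fetch_json(params):
    """Fetch and decode the json view of a search (needs network access)."""
    with urlopen(search_url(params, "json")) as response:
        return json.loads(response.read().decode("utf-8"))


gold_in_beds = [("material", 23), ("county", "BEDFORDSHIRE")]
print(search_url(gold_in_beds))
print(search_url(gold_in_beds, "json"))
print(search_url(gold_in_beds, "kml"))
```

The structure of the decoded json is not described in the post, so inspect a response before relying on any particular field.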

Put simply, any search returned from the MySQL powered search engine can return data in formats that should be able to help you produce something interesting. If you are interested in datamining our database, then OAI-PMH might be of interest to you, and this can be reached via our target (instructions at http://finds.org.uk/database/oai) with various metadata schemas. Perhaps you could use Omeka and their harvester plugin to create a site of objects from various collections for a region of England.
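For anyone who hasn't met OAI-PMH before: requests are plain HTTP GETs with a verb parameter, and the protocol requires every repository to support the oai_dc (Dublin Core) metadata format. A minimal sketch of building such requests follows. It assumes that the address given above is also the base URL of the target, which the post does not confirm, and oai_request is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Assumption: the instructions page address doubles as the OAI-PMH base URL.
OAI_BASE = "http://finds.org.uk/database/oai"


def oai_request(verb, **arguments):
    """Build an OAI-PMH request URL for a protocol verb and its arguments."""
    query = {"verb": verb}
    query.update(arguments)
    return OAI_BASE + "?" + urlencode(query)


# Ask the repository to describe itself.
print(oai_request("Identify"))
# List which metadata schemas the repository offers.
print(oai_request("ListMetadataFormats"))
# Start harvesting records as Dublin Core.
print(oai_request("ListRecords", metadataPrefix="oai_dc"))
```

A large ListRecords response is normally split into pages; each page carries a resumptionToken that is sent back (in place of metadataPrefix) to fetch the next one.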

The data snapshots that are available include data relating to county of discovery, object type, numismatic details (coins) etc. These should provide a good starting point for doing some visualisation work, but be aware that there is html in several fields:

a) a set with 1km grid references - c. 122,000 records. Due to the privacy concerns mentioned above, the full data cannot be released, and so 300,000 findspots cannot be made available. These grid references can be converted to lat/lon pairs and used for Yahoo or geonames lookups quite easily (this has been done for the full resolution references already, but it challenges you....)

b) a second set with county level data - c. 382,000 records (not all of these have findspots)
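As a starting point for set (a), here is a sketch of turning an Ordnance Survey National Grid reference into numeric eastings and northings in metres. It assumes the snapshot stores references in the usual two-letter form (for example "TL1234" for a 1km square); the post doesn't say what the CSV column actually looks like, so check before using it. The function name is hypothetical.

```python
def gridref_to_en(gridref):
    """Convert an OS National Grid reference to (easting, northing) in metres.

    Returns the south-west corner of the square the reference names, so a
    1km reference such as "TL1234" gives a point accurate to the kilometre.
    """
    ref = gridref.replace(" ", "").upper()
    letters, digits = ref[:2], ref[2:]
    if len(letters) != 2 or not letters.isalpha() or "I" in letters:
        raise ValueError("bad grid letters in %r" % gridref)
    if not digits.isdigit() or len(digits) % 2 != 0:
        raise ValueError("bad grid digits in %r" % gridref)

    # Letters run A-Z without I, laid out in 5x5 blocks.
    l1 = ord(letters[0]) - ord("A")
    l2 = ord(letters[1]) - ord("A")
    if l1 > 7:
        l1 -= 1
    if l2 > 7:
        l2 -= 1

    # Index of the 100km square, counted from the false origin (square SV).
    e100km = ((l1 - 2) % 5) * 5 + (l2 % 5)
    n100km = (19 - (l1 // 5) * 5) - (l2 // 5)
    if not (0 <= e100km <= 6 and 0 <= n100km <= 12):
        raise ValueError("grid square outside Great Britain: %r" % gridref)

    # Two digits per axis means 1km precision, three means 100m, and so on.
    half = len(digits) // 2
    scale = 10 ** (5 - half)
    easting = e100km * 100000 + int(digits[:half]) * scale
    northing = n100km * 100000 + int(digits[half:]) * scale
    return easting, northing


print(gridref_to_en("TL1234"))
```

Eastings and northings are still on the British National Grid. Getting to lat/lon needs a projection step as well, for which a library such as pyproj (transforming from EPSG:27700 to EPSG:4326) is one option.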

There are a variety of other museum and university based resources that you can access to help with hacks that you might build. For example, the collaborative Pleiades project, which is currently NEH funded, has data for 31,559 ancient places, 26,060 ancient names and 31,527 ancient locations. Great museum apis can be accessed via a variety of YQL tables, for example the Victoria and Albert (test search for object ID 012345), Brooklyn Museum (needs an api key), Digital NZ (api key needed), Black Country History Museums (test search for painting), Museum of London (test search for Roman sites), and the British Museum (test search for Egypt). And then of course there is the Culture Grid api; others will cover that in more detail.

A very simple example hack shows a search for Egypt across a variety of institutions using YQL for multiple queries of disparate resources. Another great example (the mashificator) of querying various museum resources was created by Jeremy Ottevanger from the Imperial War Museum, but this doesn't use YQL and is driven by Jeremy's magic.

If you want to talk more about using our data for any of your hacks this weekend, get in touch via twitter at @portableant or find Dan on the day.

Writing things down (an announcement)

Given the subject matter, it seems appropriate that there be someone at History Hack Day whose job it is to document the event. That person is going to be Steve Lawson, from Amplified (and elsewhere). Steve's going to be live blogging presentations, asking people what they're building, taking pictures and generally writing things down and putting them where other people can find them.

If you're not able to make the event, Amplified will provide another point of aggregation that will give you a flavour of the event. If you are at the event, then having someone who's there to observe, record, and act as a link between attendees and everyone looking in will hopefully mean that you'll be able to get a sense of what was happening outside your own project and outside the presentations. (Remember Carolina Ödman's SciLapse from Science Hack Day?)

On the subject of writing things down and aggregating them, our hashtag will be #hhd11.

We're also hoping to be able to record the talks and presentations themselves. More on that later...

Less than a week to go!

There's less than a week to go before we throw open the doors on History Hack Day. If you're coming, hurrah! I have some administrivia coming up for you shortly :-) If you're thinking of coming, I have some news which may help convince you (one way or the other...)

Administrivia:

  • The event is overnight on Saturday 22 January
  • Attendees can sleep over in the venue (you don't have to, and you'll need a sleeping bag if you do)
  • We will be providing food on both days. (You won't go hungry on our watch.)

News:

  • Signup will close at midnight on Wednesday 19 January. We need solid numbers to place food and drink orders.

If you want to come (you do!), please sign up before Thursday.

Sponsor History Hack Day!

If you've been looking at the Wiki then you'll have seen that we're filling up fast with people; datasets, APIs, and ideas are being added all the time. I'm beginning to get really excited about what it's going to be like when we're all there. It's going to be wondrous.

There's only one fly in the ointment, and that's money. Putting on a free-to-attend event costs money, and we're not there yet. However, my expired insect is your glittering opportunity, Company With a Product or Service who Want to Connect With Britain's Brightest and Best Developers! You, dear Company, can sponsor History Hack Day and do just that!

Why don't you get in touch with me? matt@reprocessed.org or +44 (0)20 7193 4195

All kinds of history

There are a whole set of problems when it comes to working with historical data, and the first hurdle is actually getting something which was written down, or printed, on paper into a machine-readable form in the first place. It’s a pretty big hurdle. Getting the archive of Harper’s Magazine online took Paul Ford eighteen months, and that was a phenomenal work rate: collation, scanning, OCR and tying it all into the archive's index. The Harper’s archive is not small (150+ years is no mean feat), but 160 years of a monthly magazine (~150 pages a month) is dwarfed by the sheer volume of the logbooks of the Royal Navy, where there’s at least one page per ship per day (and there were over 900 ships of the line in the 19th century). For an idea of scale, 160 years of Harper’s is around 300,000 pages. One year of logbooks for a fleet of 900 ships is around 325,000.
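The back-of-envelope sums behind those two figures are easy to check. The sketch below uses only the numbers quoted above (160 years, about 150 pages a month, 900 ships, one page per ship per day); both results come out close to the rounded figures in the text.

```python
# Harper's: a monthly magazine, roughly 150 pages an issue, over 160 years.
harpers_years = 160
issues_per_year = 12
pages_per_issue = 150
harpers_pages = harpers_years * issues_per_year * pages_per_issue

# Royal Navy: at least one logbook page per ship per day, for 900 ships.
ships = 900
pages_per_ship_per_day = 1
days_per_year = 365
navy_pages_per_year = ships * pages_per_ship_per_day * days_per_year

print(harpers_pages)        # 288000, i.e. "around 300,000"
print(navy_pages_per_year)  # 328500, i.e. "around 325,000"

# One year of logbooks is already more than the whole Harper's archive.
print(navy_pages_per_year > harpers_pages)  # True
```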

Clearly, if you wanted to digitise and present the Royal Navy’s logbooks in the same way as Harper’s did you’d need an army of archivists to do it in eighteen months. Not only that, but a ship’s logbook and a magazine are very different things: A logbook is a terse list of things that happened, and a magazine is a collection of, by and large, long-form prose. That means there's a big difference between what kind of data you need to get out: plain text or full-page scans, with rich associated metadata, for a magazine, and structured or semi-structured data from a logbook. 

The latest collaboration between The National Maritime Museum and Zooniverse, Old Weather, is an answer to the question of how to deal with turning that mountain of paper from the logbooks into data. The Royal Navy’s logbooks contain six weather and position readings each day for each ship, providing a global climate record that has, until now, been inaccessible to scientists. They also contain a record of everything that happened on board ship: battles, accidents, and the names of those involved.

With Zooniverse’s previous project Galaxy Zoo the problem was that there were thousands of pictures of galaxies which needed to be roughly classified, which is easy if you’re a human and really hard if you’re a computer. Their solution was to ask interested members of the public to classify some of the pictures, wait until several people had classified each one, and use the agreement between them to weed out errors. It turns out that classifying pictures of galaxies is fun, in small doses. As they themselves say:

The original Galaxy Zoo was launched in July 2007, with a data set made up of a million galaxies imaged with the robotic telescope of the Sloan Digital Sky Survey. With so many galaxies, the team thought that it might take at least two years for visitors to the site to work through them all. Within 24 hours of launch, the site was receiving 70,000 classifications an hour, and more than 50 million classifications were received by the project during its first year, from almost 150,000 people.

Having multiple classifications of the same object is important, as it allows us to assess how reliable each one is. For some projects, we may only need a few thousand galaxies but want to be sure they’re all spirals. No problem - just use those that 100% of classifiers agree on. For other projects we might want larger numbers of galaxies, so might use those that a majority say are spiral.
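The idea in that quote (keep only the objects where enough classifiers agree) can be sketched in a few lines of Python. This is an illustration of the principle, not Galaxy Zoo's actual pipeline; the data layout, the function names and the sample votes are all made up for the example.

```python
from collections import Counter


def consensus(votes):
    """Return the most common label and the fraction of votes it received."""
    label, count = Counter(votes).most_common(1)[0]
    return label, count / len(votes)


def select(classifications, label, min_fraction):
    """Names of objects where at least min_fraction of votes were `label`."""
    chosen = []
    for name, votes in classifications.items():
        if votes and votes.count(label) / len(votes) >= min_fraction:
            chosen.append(name)
    return chosen


# Hypothetical votes from several volunteers per galaxy.
classifications = {
    "galaxy-1": ["spiral", "spiral", "spiral", "spiral"],
    "galaxy-2": ["spiral", "spiral", "elliptical"],
    "galaxy-3": ["elliptical", "elliptical", "spiral"],
}

# Strict sample: everyone agrees it's a spiral.
print(select(classifications, "spiral", 1.0))
# Larger, looser sample: a clear majority say spiral.
print(select(classifications, "spiral", 0.6))
```

The same shape would apply to transcription: collect several transcriptions of each field and accept a value once enough of them match.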

With Old Weather, they’re applying the same basic approach to digitising the logbooks, starting with about 250,000 pages of them, from around the time of WWI. The National Archives have scanned copies of the log books, with enough metadata to tell what ship a log book page is from. If you take those scans and build a user interface which allows people to transcribe the data they contain, and which gives people enough feedback, then they can get a sense of the project’s progress, and what’s happening to each ship, and that starts to make it fun.

I think this is a really important project for a few reasons. Zooniverse have been instrumental in establishing Citizen Science as a viable proposition: new cosmic phenomena have been discovered through Galaxy Zoo, papers have been written and published in science journals. Old Weather suggests an approach to historical data which could be just as important. Climate scientists get access to an invaluable record, and historians will be able to get easy access to a whole host of primary source material that would previously have required them to travel and pore over the originals. There’s so much fascinating information locked away in pre-digital records that are in the public domain. Old Weather could become a design pattern for other digitisation/transcription focussed projects.

Old Weather is an enabler. If I help transcribe documents for your research project, then one of the rewards is that I gain access to the whole set of transcriptions (the work of the whole community), which I can use for my own research. We all have a stake in getting the logbooks accurately transcribed, and everyone who contributes benefits from everyone else’s efforts. (Old Weather’s results will be published at Naval-History.net.)

I’m really excited about Old Weather for its own sake, and I’m really excited about what it means for the future of a collaborative, participatory approach to gathering and preserving the primary sources that histories are drawn from. For now, the crowdsourcing approach seems to be working for log books. The next question is whether a project like digitising the Harper's archive could be chunked up and crowdsourced in a similar way. If someone can build that at History Hack Day, I'll be a happy man.