GovLoop - Knowledge Network for Government

You've Got All Our Tweets, Library of Congress. So What? Now What?

One of the big headlines last week was the Library of Congress announcing that it would archive the entire history of tweets on Twitter. We had a robust discussion about it here on GovLoop, sparked by a simple question from Harlan Wax: "Really to What End?"

I'd like to ask two more simple questions: "So What?" and "Now What?"

Having a bunch of data doesn't do us much good if we can't access and organize it. With that notion in mind, I have a potential answer.

The Library of Congress should run an apps contest, inviting developers to make it much easier to search, segment and publish tweets.

Some ideas to flesh out this vision:

1. Create a user-friendly interface that enables people to quickly search and find tweets based on any number of parameters - geography, hashtags, topics/subjects, time periods, etc.

2. Allow us to quickly flip the tweets into real chronological order, oldest first.

3. Enable quick publishing of the search content into "digital books" - attractive HTML or PDF versions that retain formatting such as people's Twitter photos - like TweetDoc, only with an unlimited number of tweets.
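
To make the three ideas above a little more concrete, here is a minimal Python sketch of what a contest entry might start from. Everything in it is an assumption for illustration: the `Tweet` record, its field names, and the `search` and `to_html` helpers are hypothetical, and nothing here reflects how the Library of Congress actually stores or serves the archive.

```python
from dataclasses import dataclass
from datetime import datetime
from html import escape


@dataclass
class Tweet:
    # Hypothetical record shape; the real archive schema is not known here.
    author: str
    text: str
    created_at: datetime
    location: str = ""


def search(tweets, hashtag=None, location=None, start=None, end=None):
    """Idea 1: filter tweets by the given parameters; return them oldest first."""
    matches = []
    for tweet in tweets:
        # Compare hashtags case-insensitively, ignoring surrounding punctuation.
        words = {w.strip(".,!?:;") for w in tweet.text.lower().split()}
        if hashtag and hashtag.lower() not in words:
            continue
        if location and tweet.location.lower() != location.lower():
            continue
        if start and tweet.created_at < start:
            continue
        if end and tweet.created_at > end:
            continue
        matches.append(tweet)
    # Idea 2: real chronological order rather than newest first.
    return sorted(matches, key=lambda t: t.created_at)


def to_html(tweets, title):
    """Idea 3: render the results as a simple HTML fragment, escaping all text."""
    items = "\n".join(
        f"<li><b>{escape(t.author)}</b> "
        f"({t.created_at:%Y-%m-%d %H:%M}): {escape(t.text)}</li>"
        for t in tweets
    )
    return f"<h1>{escape(title)}</h1>\n<ul>\n{items}\n</ul>"


# Made-up sample data, purely for demonstration.
sample = [
    Tweet("b_user", "Second post #gov20", datetime(2010, 4, 20, 9, 0), "DC"),
    Tweet("a_user", "First post #Gov20!", datetime(2010, 4, 19, 9, 0), "DC"),
    Tweet("c_user", "Unrelated", datetime(2010, 4, 21, 9, 0), "NY"),
]
print(to_html(search(sample, hashtag="#gov20"), "Gov 2.0 tweets"))
```

A real entry would need far more than this (profile photos, PDF output, and an index that can cope with billions of tweets), but the sketch shows that the basic search, sort, and publish steps fit together simply.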

What would you add to the app requirements?

What do you think of the concept?


Comment by Christopher Whitaker on April 29, 2010 at 8:39am
They would have to have some way to categorize all the tweets. I follow many news organizations because it's a quick, simple way to get headlines. I've also used it to follow events in Iran. Those tweets would be much more useful than, say, my tweets about how my Sam Houston State Bearkats just beat the crap out of our archrival on national TV. I know there are hashtags, but they aren't in every tweet.
Comment by Sheryl Grant on April 29, 2010 at 8:28am
For anyone still interested, Matt Raymond at LoC recently posted The Library and Twitter: An FAQ.

Seems that they will be working out research policies in the upcoming months, so it's hard to know what kind of access people will have to the collection. Matt doesn't mention anything about direct messages, either. Here are a few points that caught my attention:

"Private account information and deleted tweets will not be part of the archive. Linked information such as pictures and websites is not part of the archive, and the Library has no plans to collect the linked sites. "

"The Twitter collection will serve as a helpful case study as we develop policies for research use of our digital archives. Tools and processes for researcher access will be developed from interaction with researchers as well as from the Library’s ongoing experience with serving collections and protecting privacy and rights."
Comment by Sheryl Grant on April 23, 2010 at 11:25am
A lawyer once told me that the best protection for copyright before photocopiers was tedium. Public data that is made even more public strikes me as a similar issue. Yes, the tweets are public, but it would be tedious to make any meaningful sense of the data manually. You could scrape the information, store it, and probably figure out a way to mine it, but I'm guessing you would be violating Twitter's ToS. If you try to do that with Facebook, they'll show up at your house, threaten to sue you into bankruptcy and break both your legs.

Our data is already mined to within an inch of our lives, from search logs to personal email to social networks, click streams, GPS, credit cards. Am I missing anything? So Twitter hasn't done anything that hasn't been done in some way before by similar companies (or government for that matter). I wouldn't be so bothered if Twitter scrubbed the data, but for some reason no one called me to ask my opinion.

Twitter has actually been relatively late to the privacy debate. Usually it's Facebook taking blows, mostly for changing users' default settings when it changes policies and terms of service (that are relatively vague and unnecessarily difficult to follow). Basic message, however, is that "public" means "this isn't mine" once you sign up and share your data.

Ironically, I benefit from that as a researcher, but it irritates me to no end as a user.

Here's a good overview of the issues (from the perspective of Facebook and user data):

Go ahead and read it if you want to feel bad ;)
Comment by Andrew Krzmarzick on April 23, 2010 at 8:20am
@Ari - If you tweet, it's public record, eh? And any stranger can follow you now and see your metadata. So what's the difference?
Comment by Ari Herzog on April 22, 2010 at 10:49pm
I have a problem with this.

You: The Library of Congress should run an apps contest, inviting developers to make it much easier to search, segment and publish tweets.

My rewrite, which illustrates something different: The Library of Congress should run an apps contest, inviting strangers to make it much easier to search, segment and publish information from you that you never gave permission to strangers to see.

And I'm not referring to just the Twitter messages, but the metadata contained within those messages, such as locations I tweeted from, browsers and applications I used, and pictures of me. Especially pictures of me.
Comment by Sheryl Grant on April 21, 2010 at 9:34am
@Andrew, I work on a project that runs a competition each year (we're partnered with National Lab Day), and it's a surprising amount of work. But I think your idea to have a contest is really interesting. I suspect LoC acquired Twitter's archive more for marketing than building + preserving a collection, but if they saw it as a strategy to draw attention and funding to digital preservation, that's a stroke of genius. Maybe the people at IMLS would want to fund a grant to run an apps contest. Sounds like they may need simple basics if the following link is accurate, but a competition would crowdsource interesting ideas.

Twitter Archive is Nothing Without Tools, Funding:

I still stand by my grouchy opinion that we too easily roll over when our data is given to third parties, but I see that horse has left the barn. But if Twitter gets used to further digital preservation/records management practices and research, then ok. I'm in.
Comment by Steve Lunceford on April 19, 2010 at 9:37pm
Thanks for the pointer to GovTwit, Andy. Lisa makes a good point regarding shorteners -- since much of Twitter is about sharing links to other content, "naked tweets" without such relevant content made available may lose some of their usefulness to future researchers...
Comment by Andrew Krzmarzick on April 19, 2010 at 1:58pm
Quick note, everyone: please be sure to check out Steve Lunceford's related blog over at GovTwit:
Comment by Lisa Haralampus on April 19, 2010 at 10:25am
Chris - ask what the thought processes are regarding the URLs (tiny and standard). Will there need to be a matching "wayback machine" for the Internet from 2006 onward? Without the URLs, can LoC searchers make sense of the Twitter-universe?
Comment by Lisa Haralampus on April 19, 2010 at 10:22am
For Feds, I think this has a profound implication for our Federal records management responsibilities. I think that we can draft schedules that say the tweets from our agencies are indeed records - but not records that are "appropriate for preservation" because they are already being preserved by the LoC. I see that as a cost-saving issue for government. I hope that Andrew's idea takes hold because, for my train-of-thought to work, I have to be able to access the Twitter archive by agency/office/account. I read that LoC was discussing the development of search algorithms with one of its partners - Stanford University. I could also see agencies coming up with a "no private tweets" policy because those "records" would not be preserved and there are other ways to communicate privately. Twitter's value comes from its public-facing communications.

© 2014   Created by GovLoop.
