GovLoop - Social Network for Government

Andrew Krzmarzick

You've Got All Our Tweets, Library of Congress. So What? Now What?

One of the big headlines last week was the Library of Congress announcing that it would archive the entire history of tweets on Twitter. We had a robust discussion about it here on GovLoop, sparked by a simple question from Harlan Wax: "Really to What End?"

I'd like to ask two more simple questions: "So What?" and "Now What?"

Having a bunch of data doesn't do us much good if we can't access and organize it. With that notion in mind, I have a potential answer.

The Library of Congress should run an apps contest, inviting developers to make it much easier to search, segment and publish tweets.

Some ideas to flesh out this vision:

1. Create a user-friendly interface that enables people to quickly search and find tweets based on any number of parameters - geography, hashtags, topics/subjects, time periods, etc.

2. Allow us to quickly flip the tweets in real chronological order.

3. Enable quick publishing of the search content into "digital books" - attractive HTML or PDF versions that retain formatting such as people's Twitter photos - like TweetDoc, only with an unlimited number of tweets.
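
For the developers in the room, here is a rough sketch of how those three ideas might fit together. Everything in it is an assumption for illustration: the `Tweet` record, the `search_tweets` and `to_html` helpers, and their parameters are hypothetical names, and nothing here reflects how the Library of Congress actually stores or serves the archive.

```python
from dataclasses import dataclass
from datetime import datetime
from html import escape
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Tweet:
    # Hypothetical record shape; the real archive format is not public here.
    author: str
    text: str
    posted_at: datetime
    place: Optional[str] = None

    def hashtags(self) -> set:
        # Treat any word starting with "#" as a hashtag, ignoring case
        # and trailing punctuation.
        return {
            word.strip(".,!?;:").lower()
            for word in self.text.split()
            if word.startswith("#")
        }


def search_tweets(
    tweets: Iterable[Tweet],
    hashtag: Optional[str] = None,
    place: Optional[str] = None,
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
) -> List[Tweet]:
    """Idea 1: filter by any mix of parameters. Idea 2: oldest first."""
    matches = []
    for tweet in tweets:
        if hashtag is not None and hashtag.lower() not in tweet.hashtags():
            continue
        if place is not None and tweet.place != place:
            continue
        if start is not None and tweet.posted_at < start:
            continue
        if end is not None and tweet.posted_at > end:
            continue
        matches.append(tweet)
    # Real chronological order: earliest tweet at the top.
    return sorted(matches, key=lambda tweet: tweet.posted_at)


def to_html(tweets: Iterable[Tweet], title: str) -> str:
    """Idea 3: publish the results as a very plain HTML 'digital book'."""
    items = "\n".join(
        "<li><b>{}</b> ({}): {}</li>".format(
            escape(tweet.author),
            tweet.posted_at.strftime("%Y-%m-%d %H:%M"),
            escape(tweet.text),
        )
        for tweet in tweets
    )
    return "<html><body><h1>{}</h1>\n<ol>\n{}\n</ol></body></html>".format(
        escape(title), items
    )
```

Something like `to_html(search_tweets(archive, hashtag="#gov20"), "Gov 2.0 tweets")` would then turn a search into a shareable page, where `archive` stands in for whatever collection of tweets the Library eventually exposes. A real entry would need indexing, photos and PDF output, but the basic shape is filter, sort, publish.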

What would you add to the app requirements?

What do you think of the concept?

Tags: archiving, government 2.0, library of congress, open government, tweetdoc, twitter


Comment by Christopher Whitaker on April 29, 2010 at 8:39am
They would have to have some way to categorize all the tweets. I follow many news organizations because it's a quick, simple way to get headlines. I've also used it to follow events in Iran. Those tweets would be much more useful than, say, my tweets about how my Sam Houston State Bearkats just beat the crap out of our archrival on national TV. I know there are hashtags, but they aren't in every tweet.
Comment by Sheryl Grant on April 29, 2010 at 8:28am
For anyone still interested, Matt Raymond at LoC recently posted The Library and Twitter: An FAQ.

Seems that they will be working out research policies in the upcoming months, so it's hard to know what kind of access people will have to the collection. Matt doesn't mention anything about direct messages, either. Here are a few points that caught my attention:

"Private account information and deleted tweets will not be part of the archive. Linked information such as pictures and websites is not part of the archive, and the Library has no plans to collect the linked sites."

"The Twitter collection will serve as a helpful case study as we develop policies for research use of our digital archives. Tools and processes for researcher access will be developed from interaction with researchers as well as from the Library’s ongoing experience with serving collections and protecting privacy and rights."
Comment by Sheryl Grant on April 23, 2010 at 11:25am
A lawyer once told me that the best protection for copyright before photocopiers was tedium. Public data that is made even more public strikes me as a similar issue. Yes, the tweets are public, but it would be tedious to make any meaningful sense of the data manually. You could scrape the information, store it, and probably figure out a way to mine it, but I'm guessing you would be violating Twitter's ToS. If you try to do that with Facebook, they'll show up at your house, threaten to sue you into bankruptcy and break both your legs.

Our data is already mined to within an inch of our lives, from search logs to personal email to social networks, click streams, GPS, credit cards, texting...am I missing anything? So Twitter hasn't done anything that hasn't been done in some way before by similar companies (or government for that matter). I wouldn't be so bothered if Twitter scrubbed the data, but for some reason no one called me to ask my opinion.

Twitter has actually been relatively late to the privacy debate. Usually it's Facebook taking blows, mostly for changing users' default settings when it changes policies and terms of service (that are relatively vague and unnecessarily difficult to follow). Basic message, however, is that "public" means "this isn't mine" once you sign up and share your data.

Ironically, I benefit from that as a researcher, but it irritates me to no end as a user.

Here's a good overview of the issues (from the perspective of Facebook and user data): http://epic.org/privacy/socialnet/

Go ahead and read it if you want to feel bad ;)
Comment by Andrew Krzmarzick on April 23, 2010 at 8:20am
@Ari - If you tweet, it's public record, eh? And any stranger can follow you now and see your metadata. So what's the difference?
Comment by Ari Herzog, MPA on April 22, 2010 at 10:49pm
I have a problem with this.

You: The Library of Congress should run an apps contest, inviting developers to make it much easier to search, segment and publish tweets.

My rewrite, which illustrates something different: The Library of Congress should run an apps contest, inviting strangers to make it much easier to search, segment and publish information from you that you never gave permission to strangers to see.

And I'm not referring to just the Twitter messages, but the metadata contained within those messages; such as locations I tweeted from, browsers and applications I used, and pictures of me. Especially pictures of me.
Comment by Sheryl Grant on April 21, 2010 at 9:34am
@Andrew, I work on a project that runs a competition each year (we're partnered with National Lab Day), and it's a surprising amount of work. But I think your idea to have a contest is really interesting. I suspect LoC acquired Twitter's archive more for marketing than building + preserving a collection, but if they saw it as a strategy to draw attention and funding to digital preservation, that's a stroke of genius. Maybe the people at IMLS would want to fund a grant to run an apps contest. Sounds like they may need simple basics if the following link is accurate, but a competition would crowdsource interesting ideas.

Twitter Archive is Nothing Without Tools, Funding:
http://www.readwriteweb.com/archives/twitter_archive_is_nothing_without_tools_funding.php

I still stand by my grouchy opinion that we too easily roll over when our data is given to third parties, but I see that horse has left the barn. But if Twitter gets used to further digital preservation/records management practices and research, then ok. I'm in.
Comment by Steve Lunceford on April 19, 2010 at 9:37pm
Thanks for the pointer to GovTwit, Andy. Lisa makes a good point regarding shorteners -- since much of Twitter is about sharing links to other content, "naked tweets" without that linked content may lose some of their usefulness to future researchers...
Comment by Andrew Krzmarzick on April 19, 2010 at 1:58pm
Quick note, everyone: please be sure to check out Steve Lunceford's related blog over at GovTwit:
http://www.blog.govtwit.com/2010/04/19/every-tweet-you-make-theyll-be-watching-you/
Comment by Lisa Haralampus on April 19, 2010 at 10:25am
Chris - ask what the thought processes are regarding the URLs (tiny and standard)? Will there need to be a matching "wayback machine" for the Internet from 2006 onward? Without the URLs, can LoC searchers make sense of the Twitter-universe?
Comment by Lisa Haralampus on April 19, 2010 at 10:22am
For Feds, I think this has a profound implication for our Federal records management responsibilities. I think that we can draft schedules that say the tweets from our agencies are indeed records - but not records that are "appropriate for preservation" because they are already being preserved by the LoC. I see that as a cost-saving issue for government. I hope that Andrew's idea takes hold because, for my train-of-thought to work, I have to be able to access the Twitter archive by agency/office/account. I read that LoC was discussing the development of search algorithms with one of its partners - Stanford University. I could also see agencies coming up with a "no private tweets" policy because those "records" would not be preserved and there are other ways to communicate privately. Twitter's value comes from its public-facing communications.


© 2010   Created by GovLoop.
