National Digital Newspaper Program Lightning Round Talks 2012

Oh really, you’re too kind. So, I’m sounding a bit like a frog this morning, so I’m really glad that this is the session that I signed up for. So, show of hands: how many people here attended Metamorphosis at U.K.? Ah – that’s so awesome. So, we did that for about 4 years, and to follow up on what Errol said, we came to do Metamorphosis because we had done everything in house, and doing it the hard way, crazy as it seemed, taught us a lot. And what it allowed us to do, then, was to be able to go teach other people how we did it. So many things about the program have changed since then, but what we decided to do last year was take what we had done in person and turn it into a series of teaching tutorials that anybody around the world can access as long as they have internet access. So, here’s our home page; what we’re going to do – not all of the tutorials are done yet – we’re designing these so that anybody at any level from any institution will be able to get something from it. Let me go to the “About” page, and we’ve got a couple of short films here. I think somebody mentioned the Kentucky edition film that we had done before, and then we’ll say a little bit about our tutorial presenters. Notice that Becky is an avatar, and so is Errol. [ laughter ] And so is Deb. And what we’re doing a little bit differently with Metamorphosis this time is, when we did it in person at U.K., it was all U.K. people who were doing the teaching, with the exception of Errol and Barbara Taranto, who came in and did some sessions on RFP, and I think Deb came and spoke about some stuff. What we’re doing differently with this is we’re reaching out to the community. One thing that I realized yesterday, you know, when we found out that there was going to be an end for us. You know, we were very concerned that once the graduates left, that, you know, a lot of brain trust was going to walk out the door. And what I realized yesterday from the sessions is that’s not true. 
So many of you have come along and you’ve found your own way in the world of newspapers, so we’re taking that and we’re going to reach out to other people, some of you who have excelled in particular areas. And so, Metamorphosis is becoming less U.K.-centric and more NDNP-centric, because we all share a great body of knowledge, and you know, I think we’ve proven that there’s no one particular right way to do something; there are various ways to do it. So, Deb and Nathan from LC, I recorded their sessions on Tuesday. Santi [Thompson], for those of you who were here last year, you saw him do a great presentation on working with vendors, and this session isn’t quite as funny. But you know, he’s Santi and he’s still funny. So let’s go over to the tutorials themselves, and this will give you an idea of what we are going to be offering. You’ll notice the ones that say “coming soon” are the ones that I haven’t finished yet. I’ve got a lot of recordings so far, but it takes a little while to put the images with the voices. So it’s a work in progress, but another thing that we are doing with this as well is quizzes. Because sometimes when presenters are, you know, you can write out your script and be very thorough, but every once in a while you get somebody like me who inevitably forgets something. So in these quizzes, which are coordinated with the tutorials, we’ll quiz you on that tutorial, and if there is any other information that maybe wasn’t explained thoroughly or was perhaps left out, we can include it in the quiz. So, it’s not just a quiz – it’s another learning tool. The bad thing about these right now is that they require Flash. Yeah, they were built with Captivate, so we might have to. It’s a work in progress, did I mention that? So the other thing that you should know is when you click on one of these it’s going to take you out to our YouTube channel. 
The Metamorphosis YouTube channel, because the server this is sitting on isn’t a streaming server, at least not for now. They are all under ten minutes, and you will notice, you will notice something. Where did it go, where did the site go? There it is. I have completely lost my train of thought; has anybody seen it? Well, whatever it was, it wasn’t real important. Then we have a resource page over here, and this will lead you to a lot of the information that NDNP has on their website and some other things as well. This is also a work in progress, so let me just say while I still can that if you have resources that you think will be helpful to other people, let me know and I can include them on this page. For those of you who are new to the program, you get to be my guinea pigs. And I already got an email from Nevada, and they had come across the site already, and so they’re also a guinea pig. So it’s really sort of interesting, you know, we have a house full of people here who have never been in NDNP but are now, and then we’ve got this other state who really wants to be in NDNP. So it’s lining up to be a perfect storm. Questions? Okay, I did a really good job, I guess. [ laughter ] All right, that’s it. [ applause ]

Okay, again, I’m Lauren Charney; I’m from the Digitizing Louisiana Newspapers Project. And this is probably more beneficial for the new awardees. It’s an alternative way of kind of handling all the information that you are going to be distributing and sharing with the various people who work with you on your project. I know a lot of people like to use things like Basecamp and other types of software for project management. I’m not so much about that. I really like to use things that are very easy for our advisory board to use. They don’t have to learn anything. Other members of our DLNP team, they already have a lot of stuff they have to do, and if they have to go in through something like Basecamp they are going to riot. 
So I like to use different types of media that allow for much more asynchronous collaboration. It is absolutely necessary, not only because the people that we work with in-house all have other duties. Like, we have the assistant curator of books; he writes our essays. We have people who are working in our microfilm processing; they are not officially just working on our project, they have other things that they have to do. So I want to be able to share information with them, but I also want to be able to let them kind of access that information when it’s useful to them. Additionally, this is where all of our advisory board members are; we actually only have 3 in the same city with us. We do not meet ever as one group in person, so making sure that we can distribute that information as easily as possible is necessary. And we chose to share almost all of the information using a wiki. We use a PBworks wiki. We were able to share everything from how to contact us, to the conference call and how to get in, to PDFs of the long list of all those 600-and-something titles that were eligible, to details about how they should grade things. We chose to use the wiki because it was super simple for them; it’s a very familiar interface for most of them. It looks a lot like a normal computer: you have the wiki page, and you can go through pages and files that look a lot like the file folders on your computer. You can set different levels for writing, and this was important to us. We didn’t want anybody to overwrite anything. Members of our advisory board had various levels of tech skills, so we didn’t want to give them too much power. And then, you know, you can also set privacy settings. We wanted to make sure that our stuff was open only to the people that were supposed to be in on it, so having a workspace that we could close down from the general public, so they don’t come stumbling across it or finding it on Google, was important to us. 
So again, these are the advisory board resources. We can share our surveys, which is how we gathered information, and again all the lists. It is very user friendly. We have done it in 2009 and 2011, and we had great feedback from our advisory board. They were very comfortable with it, and that was important to us. Other options: you can use WordPress, Blogger, or Wikispaces; Google Sites also has a wiki site. These are all free; they all have various terms and conditions that you might want to make yourself familiar with. And basically the biggest thing is, again, having privacy, allowing people to write but not letting them overwrite your stuff, and being able to share and upload files. Additionally, when gathering information we chose the SurveyMonkey model. I know not everybody likes it; it’s how we chose to do it. We basically asked the same two questions for all of our titles. They had to judge each title on its own merit, not against each other. We didn’t ask them to rank 600 titles in order of importance; we just wanted to know from them what they thought had the most research value and what represented the greatest diversity. And then we kind of took those numbers ourselves. What we liked about SurveyMonkey is it had various levels of service. We chose the Select plan, $17 per month; we used it for 3 months and then we could get rid of it. And the only reason we chose this over the free plan was because we could lock it up so nobody else could use it. And it provided us with PDF and Excel files when we were done. So it kind of did a lot of the calculations for us, so it took some time off our hands. Other options for gathering besides SurveyMonkey: not a lot. If you want to replace it with something like E-Survey Pro, you can. I did some research trying to find out what other survey options there were out there, and almost everybody online seems to say go SurveyMonkey. There used to be Zoomerang, recently bought by SurveyMonkey, so, get over it. 
E-Survey Pro is very clunky; a lot of people don’t like to use it, so that’s our option. Finally, we can’t do everything asynchronously, as much as we’d like to. We do eventually have to talk to each other, so we chose Doodle to kind of help set up our meetings, because we can do that asynchronously. Using Doodle for scheduling is very simple: you set up an event, you tell it whatever you want it to be, who you are, what your email address is, what dates you can do it on, and what times are available. Then you can send the invitation out yourself, which is what I prefer to do. I don’t want to have to open a new account every single time I use some sort of social media. So Doodle is awesome because when I want to build something, they give me the link and I can email it out; I don’t have to register with them. So being able to do that, we can send the link out, we can edit it, and people can tell us what times they’re available, and from that we can determine when we’re actually going to have the meeting. Other options: Whenisgood.net, Congregar and TimeBridge. Tungle used to be big, but they are no longer going to exist after December 3rd, so sad news for everyone. I mean, generally we just use it because it’s working smarter, not harder. We don’t want to make our advisory board do too much more heavy lifting than they are already doing. They are doing us a service; we wanted to make life easy on them. And yeah, that’s pretty much all I have. Do I have any questions? Okay, well, thank you. [ applause ]

Good late morning everyone, and aloha. My name is Erenst Anip. I am the Project Manager and Social Media Lead for the University of Hawai’i at Manoa Library. I will be talking to you about this wonderful long title you can read on your own. Basically our university, or our state, has been with NDNP for 3 phases. 
This is the start of our third, and we kind of had a big learning curve in the first year, but after that we felt more comfortable with the technical parts. We don’t really have a lot of money, and I myself don’t really have that much technical expertise, so we can’t create, like, wonderful fancy portals and all that. So I focus my effort on the outreach part and adding value, or value-added content, to Chronicling America. And this thing does not work, or I didn’t figure it out. So what is our existence on the internet with the Hawai’i Digital Newspaper Project? We have the wiki, which is a very loose definition because we just use Google Sites. Very basic; we just wanted to put all of our technical and administrative matters there to begin with. But eventually it became a one-stop shop: we started to add stuff like events that we attended and talks that we’ve given, and also value-added articles that I will talk about a little later. So this is the screenshot; as you can see, we have our team members, our plan of work, our workflow is also documented here, and other wonderful stuff there. And then we have our LibGuide, which started in 2010, and this is under our library website structure so it’s easier to find. And this is more for content and search strategies for people. And again, we are slowly moving our topic guides and articles into the LibGuide. And this is the screenshot from it: we see project overview, search basics, also the 2 things I have mentioned briefly, and some technical info that connects back to the wiki. And what are some of our value-added content? Definitely we have the topics in Chronicling America, which have already been there since our first phase. And yesterday, probably, we had our hula on the mainland project featured on the NEH site, which is awesome. And we have more topics in our wiki, like leprosy, Pearl Harbor, and statehood, that we have covered extensively in that wiki. 
And also the historical featured articles, also in our wiki, and there are 3 different sections of it. So we have stories, newspaper sections and ads. And so there are things from very serious stuff, like digging up bones of native Hawai’ian warriors in the temple, to more light stuff like obitu-, no, that’s not light at all. [ laughter ] Obituaries, which is not light, and bloopers in newspaper layouts; you’ll see, like, the newspaper layout is upside down or left to right. I’m sure you have seen that in your own newspapers. And Hawai’ian music records and other things. These are mostly text based, and now we also have a Flickr collection set. Because we don’t have the funding to go to the article level, we do it kind of manually with graduate students and myself. So we created these 6 different sets currently. We have recipes from back in the days, photos, administrations, quips and quotes, bikes in the news, and ads. And some of my favorites are how beer is very healthy. I’m sure everybody here agrees with that. And also how news could be very fun back in the days. [ laughter ] So, this is before the 24-hour news cycle happened; there we go. And in terms of social media outreach, we repost what we receive from LC, now, 100 years ago somewhere in the U.S. We also come up with our own local 100 years ago in Hawai’i. And when there are relevant events, major events like the Queen’s birthday or the Titanic sinking, which was a big hit when we posted it up. Also during the Olympics we highlighted Duke Kahanamoku’s win of the gold medal in swimming back in the days; that was also a great hit. And this is just a screenshot of our Facebook and Twitter. These two are tied together, so we post on Facebook and it shoots straight to our Twitter also, so we save time that way. And these are some of the things that we have posted relevant to Chronicling America. 
In terms of outreach, this is a litany of all the things we have done. We’ve integrated our resources, or Chronicling America, into resources for library instruction. We presented at statewide conferences. We also presented at PIALA, which is the Pacific Islands Association of Libraries and Archives conference, so that’s like the ALA of the Pacific. And we brought up this possibility of perhaps having a collaboration project with some of the Pacific Islands, and perhaps, hopefully, Deb can come to one of those in the next few years. And we presented at LIS courses, Library and Information Science. Road shows, and also, since I’m involved in the orientation team for the library too, we included the bookmarks and other materials that have been provided by NEH and LC in our welcome package, just to have everything integrated into our welcome package. And we are in the process of creating our own statewide materials like bookmarks, posters and other swag. And this Arizona bookmark is one of our inspirations right now, so if you don’t have yours, pick it up; it’s outside, or talk to Yin about it. And then these are some of the interesting ways people have been using our resources. I am sure you have seen the data visualization project by Cyrus. Anybody seen it or heard about it? He just put all 4,000-plus of the first pages of the Hawai’ian Star into, like, a time lapse, and he basically came up with this analysis of all the wonderful stuff that he could pick up from there. I don’t expect you to read it from there, but you can look it up yourself. We also have currently in the library an exhibit about paddling in Hawai’i and how paddling is huge there. This is the intro banner for the exhibit, and these are blown up to maybe 6 feet high, and as you can see, all the newspaper articles and cutouts are there and cited. So we know that Chronicling America resources are being highly used in our state. 
And we also have our own library exhibit for the project, the Hawai’i Digital Newspaper Project; this is in the Hawai’ian Collection. And since we are good at digitizing the newspapers, we are not really good at laying out exhibits or taking good pictures from above; that’s why you can see the lights right there. Please pardon the photography skills. Also, just news entries and blog entries that are wonderful. Like the Maui News found out, I think they Googled themselves and found, hey, these are our first issues back in the 1800s, so they did a coverage of that when they were doing their century celebration. And also @Hawai’i, the owner of that Twitter handle, also updated about our project, and he’s like the social media person in Hawai’i, so I think that’s quite a big deal. And also these interesting blogs. Like, this is about the Portuguese in Hawai’i, and as you can barely see there, there are at least 3 pages of search results that talk about Chronicling America. There’s this blog about news that is not covered in the media, and it goes very much into detail, citing all the articles or newspaper pages in Chronicling America about energy and power in Hawai’i in 1850-1893. And we also have Nupepa. They connect our English-language newspaper digitization in Hawai’i with the Hawaiian-language newspaper effort, also in Hawai’i, which is crowdsourcing based, and he tried to tie the communication in the news between the two of them. And there are quite a few interesting entries from there too. So this way we are using our librarian skills to kind of figure out, oh, this is how people have used the sources from Chronicling America. So that’s what we have been doing in Hawai’i, and thank you very much. If you have any questions?

Wonderful. First of all, we’ve got a wonderful slide show about the whole Gateway to Oklahoma History that Sarah Lynn and Malory from our staff created. 
But I’m not going to go into that, ’cause that’s all the details and guts. If anybody wants that, I’ve got it on a portable drive, and I can show you that. What I really want to talk to you about is not the details of the Gateway but how it happened and why it happened, and everything that we do is all about partnerships, collaboration and influences. And the gifts I gave, Kansas, they, before 2009 when we really got to looking at NDNP, Kansas was starting to put things up on their website, and we saw that and we interacted with that internally and said we want to do that kind of thing. And then, you know, there are those kinds of nudges from above, and then you have the University of North Texas, who diligently said, “You guys need to be with NDNP.” And it was just, you know, that internal struggle of years of the way things were done, and now everything we do is about these partnerships, because we couldn’t do any of it without the partnerships. And so that’s kind of what I want to talk about. In 2009 we applied for and received an NDNP grant, and that basically changed everything. I mean, when we were able to go to our executive director and show Oklahoma papers on Chronicling America, you know, it was, my gosh, what can we do next? The great thing about the historical society is we have great collections. We collect and preserve, but we weren’t that good at sharing, and, you know, seeing what’s happening on Chronicling America and what everyone was doing, what Kansas was doing, what the University of North Texas was doing, it was, why can’t we do this? We have 5 million pages of pre-’23 newspapers in our collection. We own those collections, so we have a big advantage of not having to go out and find these. So it was, okay, let’s find a grant to make this happen, to be able to get all these pages eventually up and available. And so we went with that; we applied for a grant for that. 
Excellence and ethics in journalism, and the initial grant wasn’t even to make things OCR searchable. We had applied for this and talked to them before NDNP; it was just to basically get away from the microfilm readers, to get it where people can look at them online. And of course, with the experience of NDNP and Chronicling America and working with the University of North Texas, we found we have to make this searchable. We have to. So we changed internally how we were doing this, found more funding and made it happen. And now the Gateway to Oklahoma History is up. We are just going to zip through, and you can see this wonderful program Sarah Lynn and Malory did. Give me a second. Okay, so this is the Gateway to Oklahoma History, and it’s really just a continuation of Chronicling America and what you and everyone at NDNP have done. We collaborated with the University of North Texas, who is our partner on the NDNP grant, so we created this, and it really allows us to stick to what we do best, which is collecting and preserving. But with the experience with NDNP and Chronicling America and the University of North Texas, we have made giant steps in technology. We have two nextScan conversion machines, so we do the conversion on this project, where on NDNP it’s iArchives and the relationship UNT has with them. So we do everything we do on NDNP, but we do the microfilm conversion, and we have in-house microfilming, so, you know, we can do it all in house to the point of shipping off that data to the University of North Texas, where they take it from there and do the OCRing, and they house and run the entire site. And they are wonderful. It’s seamless: somebody goes to our main website, and this is actually housed at UNT, but it’s seamless with the way it looks and everything, and it’s all about, you know, specialties. You know, they are incredible at what they do, we’re great at what we do with collections, and it just snowballs from there. 
I’m just horrible with these things. Once we were able to show people the Gateway to Oklahoma History and what we can do, one of the biggest papers in Oklahoma, The Daily Oklahoman, they were looking for a home for 1600 volumes of the Oklahoma City Times newspaper. I have a poster about that, how that happened; it’s all about partnerships and relationships. They had nowhere to put these, they wanted to get rid of them, they were probably going to throw them away. We partnered with the Oklahoma Department of Libraries. They had warehouse space; we didn’t have the space to even house them. But we can microfilm them, we can scan them, and then with UNT we can put them up on the Gateway. And this will be a long-term project, but it’s something that would have just been destroyed if it wasn’t for the partnerships and the collaborations and the influences, which goes all the way back to Kansas and us going, look at what they are doing. And, you know, that type of thing. But with the Gateway, it’s all about getting all of our pre-1923 papers up eventually, and with the granting institution, we went back to them and said, hey, you know, we can do this and just stick all of these digital images up, but it would be better to look at this, look at Chronicling America; this is how we want to do it. And so they gave us an extension on how long it will take us to do it, in exchange for us adding the funding and the ability to make it OCR searchable, and that’s the direction we went. So then it just snowballs from there into other projects, just because we can show Chronicling America and the Gateway. Then there’s, we go to the Oklahoma Publishers Association meetings. We have a really good relationship with them; we have microfilmed all of the papers in the state for the last 80 years, or however long microfilm has been around, way before I was here. 
But so there are 21 newspapers that are ready to give us those post-’22 copyrights to go ahead and put those up on the Gateway. And, you know, it allows us to be able to partner with UNT and others, to go out and have these other relationships and put our efforts into getting them to let us do this, rather than the huge technical side that we are not quite ready for, so that’s how that works. And just like the Oklahoma City Times, it’s 84 years, you know, 1984 is when that ended; they are giving us the rights to put all of that up, which, you know, we can do with ’22 and before, but them giving us those rights is just huge. Where am I at? One minute, my gosh. So, another with the same company: seeing what we did with the Gateway, we talked with them, and they are donating their entire photo collection, which is 1.5 million photographs, and we are outsourcing the scanning of that and having UNT put those on the Gateway. So this is not going to be just a pure newspaper platform; it will handle everything that we do: film and video, more photograph collections, oral histories, maps, Indian records, things that we have on microfilm we will digitize and put on this platform. And this is only possible through these partnerships with NDNP and Chronicling America and Kansas and UNT and other organizations within Oklahoma and outside Oklahoma. So let’s see if that’s everything. That’s everything; any questions?

Hi everyone. We are just going to tell you a little bit about some of the outreach activities that we are doing. This is kind of focused on how we reach different audiences in Vermont and beyond. So let me get started here. These are some of the ways that we reach audiences: we have virtual and in-person communications. There’s a little screenshot of our blog up there, which I know some of you have seen, but I try to update it, you know, every week or two to keep new material coming through. 
We write about content that’s in the newspapers, highlighting different things that have happened; like, 150 years ago obviously is of great interest. We also do 175 years ago, because we have newspapers that are from the 1830s, so we can pull great stuff out of there too. Of course we have a website, and we use social media, Twitter, Facebook. We are working on doing a webinar with the state library to educate primary school teachers and high school teachers in the state of Vermont as to how to use Chronicling America. We also recently produced a video that explains what we do, which will be used at the University for history classes. So in the outreach that reference librarians do at the University to history classes, they’ll be seeing that video, so they will understand what role we can have in their research. To an extent we are involved in classes, some in-person stuff there, conference talks and general audience talks. So I’ll just go through a couple of things, and then Birdie will go a little more in depth in the brief period of time that we have. So looking at some of our statistics, we have had 7,400 views since the blog began, and on our Facebook page we have over 250 views in the last 3 months. So we are getting some traffic on the Facebook page and quite a bit on the blog, but I think that we are hitting different audiences there, because on the Facebook page you have just blurbs, and somebody who has a couple of minutes to look at something will look at the Facebook page and get a notion of what’s going on there. Whereas with the blog we tend to write articles, so someone might spend five to ten minutes on the blog versus less than a minute or maybe a couple of minutes on the Facebook page. 
So when opportunities come up to talk about the VTDNP we always try to say yes, so that’s kind of the basic message here that I would try to get across: if there is something going on, go ahead, get out there, try and have a slot in it and try to talk about the great work that you are doing. So talks lead to other talks. I gave a general interest presentation at the University of Vermont, which someone saw who was with the Vermont Genealogical Society and invited me to give a talk next month to their group, so someone there might invite me to do something else; these things lead to other things. So on that note I’m going to hand it over to Birdie, who will talk more about this.

Hi, yeah, I think from the beginning of the project we were really excited, first of all, to be funded. And at the University of Vermont we have people who work just on communication, to convey projects that the library is working on and library services. So we were really excited when we first got funded, so the media picked up on our funding, and the Library issued a little excerpt on our website about the funding. That in turn was picked up by University Communications. In Vermont we are involved with several state partners: the State Historical Society and the Vermont Department of Libraries, which is the state library. We are working with them very closely on microfilm; they have the negatives. And also the Ilsley Public Library in Middlebury, and our partner Chris is here too from the Ilsley. And kind of simultaneously everybody launched some kind of press release about the grant, and that was the initial part of really building an audience, to let people know what was coming. So gradually, as we were building this audience, there was a lot of interest about, well, when is the content coming? And Deb Thomas came and visited us for the progress check in the spring, just as production was getting under way. 
And we capitalized on her visit; we put that out in a press release at the university, and so different people came from around the state. And one of the people was a journalist from the major daily in Burlington, where we are based, where the University of Vermont is based: Tim Johnson. So as content became available, the media picked up on it themselves. Tim Johnson came back following Deb’s talk, when we announced that content was available, and different newspapers, the local dailies, picked it up, and WCAX, the local news channel, also did an article about the project. Vermontdigger.org is a website powered by journalists, and they picked it up and did a nice article on the project. So it’s just to say that news about the project is worthy, people are interested, and I think we are able to transmit our enthusiasm and excitement outside. And people have talked about this snowball effect; I think that’s been the theme since the beginning: as more content has become available, people are more interested. So then people have come to us and approached us for talks, and so this is just a list of some of the things. Like the Hawai’i project, we did the state, well, we were invited to go to Massachusetts and talk about the project at the Massachusetts Library Association. We did a Pecha Kucha night. Some of you, there was a storm last year, Tropical Storm Irene; there was a lot of flooding and major damage around the state. The Fleming Museum is a museum on campus; they were doing a Pecha Kucha night. You have 20 slides, 20 seconds per slide. It’s really fast paced; it was really fun. A group of us worked on that, Tom and Prudence, who’s not here but who’s pictured. The New England Library Association was based in Burlington, so we pitched a proposal to talk about the newspaper project. A brown bag lunch in the library; we also pitched a proposal for the North American Serials Interest Group in Tennessee this year. 
So the pictures are from the Vermont History exhibit, which was in southern Vermont; we are based in northern Vermont. I think it was held in a barn at the Tunbridge fairgrounds, the world famous Tunbridge fairgrounds. We set up an exhibit there, and that drew 4-7 thousand people and generated a lot of interest in the historical community, so we’re just trying to do diverse outreach. Tom did a K-12 tutorial and, as he mentioned, the video, and the genealogical society talk is coming. As part of that marketing strategy too, to talk about the project, we worked with a graphic designer to really develop a theme around a logo and a banner, so the logo we are using on our website and Facebook pages. And the old hand press with Freedom and Unity, which is the state motto. So it just kind of gives us some images to talk about and get behind, to rally the cause for the project. So again, the snowball effect: we found that just starting small, just talking about our project because we were happy to just get funded, has led to other opportunities. The media picked up on our story, and as a result we’ve gotten invitations to do different presentations. So yeah, that’s our logo. Start small, respond to opportunities, answer calls for participation if you see calls for papers or participation. And watch for this snowball effect. I wanted an image because it’s not just in Vermont, and I was glad to hear Chad from Oklahoma say the same, but it’s just this cumulative effect of this little ball of snow going down a hill and gathering momentum and mass and surface area and growing and becoming something people are really interested and engaged in. And, most importantly, coming to Chronicling America and the content that’s available. Another result was that just last month we had requests from content producers who have seen content, Civil War content. I think one was for Lincoln, yeah, the Lincoln and, what was the vice president’s name?
Hamlin. The results of the 1860 elections, showing President Lincoln and Hamlin as the winners. That’s going to be used in a documentary on our website. And we got a request from somebody working on a PBS documentary on abolitionism, which is great because Vermont was the first state in the Union to outlaw slavery, and they found it on Chronicling America. So that’s it. [ applause ]
Okay, I’ll just get started. When I started working on the Montana Digital Newspaper Project 2 years ago, I was astonished by the variety of content I was seeing in small rural Montana newspapers. On investigating, I stumbled on an industry that has been widely known in journalism but that I had never heard of. The phenomenon is called Ready Print, sometimes called patent insides because of the advertising space sold to the patent medicine makers. How it worked is this; I’ve got to step away from the microphone. I’ve got an entire page of preprinted general interest content that was written, edited, typeset, and printed in a central location, Chicago, and shipped to a rural editor in Montana, Idaho, or Iowa with a blank reverse side. The local editor then printed his local news on that blank side. And when he folded that, he ended up with a nice respectable four-page weekly newspaper when he only had 2 pages of local news. That’s how it worked, from about the Civil War to about 1876. The benefit to the rural editor was profound: he could publish a respectable 4-, 6-, or 8-page weekly paper and survive financially, because the use of Ready Print cut his production cost nearly in half. At the same time he could offer readers a type of content until then found only in urban dailies. Ready Print spanned, let me just switch down here, here we go.
Just to make one quick distinction here: when newspapers participated in a newspaper exchange, they could copy information or articles from another newspaper, re-typeset it, and print it in their own paper, but when they did that, 99% of the time they credited the other newspaper, so you’ll see Chicago Herald or something down at the bottom below that little blurb. When they received paid content from the American Press Association, the APA insisted on having their copyright line. That’s not Ready Print. Ready Print, 99% of the time, is completely unattributed – there’s no editor, no source, nothing. Ready Print is typeset once by the supplier, so when you see it in multiple papers, it looks exactly the same; the column width’s the same, the column break is the same, the illustration is embedded in exactly the same way. The content is not date-dependent; it’s current in April, it’s relatively current in June. It’s rare to see one that’s more than a year old, and usually they all came out within the same month. So, here’s an example from the Salt Lake Herald; this is a Ready Print fashion column. The same column appearing, I think, on the same day in the Morning Call in San Francisco. The same column appearing the same day in the St. Paul Daily Globe. So, that’s an example of Ready Print in Chronicling America. Ready Print content spanned a range: serialized adventure and romance stories, the so-called “women’s pages”, features for children, farming information, science, profiles of world cities, biographies of famous people. Let me just go through a couple more here. There’s an article on the origin of rubber appearing in a Montana paper in May of 1905. Note where the illustrations are and the column breaks are; the very same thing appears in a Kentucky paper in April, 4 years later. Well, they took their time with that particular piece of Ready Print; it might have been sitting on the shelf for a long time before that Kentucky editor needed to use it.
You will note, though, that this appears on the right side of the page, and this is appearing up in the upper left. That’s because this is, by 1909, coming from stereo plates, not pre-printed, but the plate is coming to the editor. I’ll talk about that in a minute. The Ready Print suppliers were shrewd. When editors complained that pre-printed pages stood out because of their different column width, the Ready Print firms made the same content available in any column width the editor requested. If an editor commented that Ready Print typefaces didn’t match his local typeface, he was offered a remarkable deal: he could box up all of his existing sorts and ship them to Chicago. In return, he would receive a brand new complete set with which to fill his type cases, in a face that matched the Ready Print to which he already subscribed. After stereo-plating arrived, as I mentioned, editors could order either pre-printed whole pages, or the same page shipped as a thin metal plate. The editor could cut the plate apart with scissors, and thereby make up his own pages, mixing Ready Print and local content on the same page as he pleased. That’s what happened with the origin of rubber. Incidentally, this led to some criticism of rural weeklies as being, quote, “edited with a saw”, unquote. In this case, of course, the local editor bore the entire cost of ink, paper, and printing labor, but because typesetting was far and away the biggest expense, using the Ready Print boiler plate – that’s the metal plates that they would receive in the mail – still saved the local editor considerable amounts of money. Many rural weeklies owe their existence to the Ready Print industry. Here’s another example from stereotype, a funny article called “Odd Occupations”. Here it is in a Missouri paper, November 16th, in Savannah. A different Missouri paper on November 16th, but then, if you’ll look at that first line, the New York correspondent of the Chicago Tribune writes, “we are fast”.
Here it is again, but the first line is, a New York correspondent writing about odd occupations in that city, says that the wagons of the…blah, blah, blah. So, these two are from the same source; they were typeset once. This is from a different source. The first line was rewritten and the entire piece was re-typeset, as you can see by the line breaks. Here’s another example of Ready Print: “I Was Swallowed by a Boa”. This appeared on November 17th in Utah. It also appeared in November or December in Montana, in the Philipsburg Mail; I can’t tell which because they don’t have dates or page numbers. The Philipsburg Mail has been a real challenge, but anyway, you can see that it’s the exact same; it was typeset once. The columns are the same and the illustration is embedded there in the text in exactly the same way, but each editor put it in a different place on his or her page. There’s been one author, in particular, who has made a point of suggesting that the Ready Print industry was a giant scam that was imposed on rural readers, and that they didn’t realize this content was not being written down the street by their local editor. For me, I don’t entirely buy that. I think readers in Iowa were more sophisticated than that, and if they saw an article, “I Was Swallowed by a Boa”, I think they could reasonably assume that did not occur in Des Moines. Okay, let’s see here. Intense competition among Ready Print suppliers forced prices down until Ready Print cost five dollars a week, only slightly more than the price of blank newsprint. The Ready Print firms could afford to do this because the bulk of their revenue came not from client subscriptions, but from advertisers eager to promote their products to a vast and growing population of rural readers. Remember that prior to 1930, most Americans lived in rural areas. This audience was estimated in the early 1900’s at 25 million – bigger than the circulation of any single newspaper in the world.
In the U.S., between 1870 and 1900, 25,000 new newspapers were founded – over twice the total number of papers existing in the rest of the world combined. In the 1880’s alone, the rate of growth averaged two new titles founded every day. In the same period, the number of Americans who subscribed to a newspaper tripled. Along with this extraordinary growth in titles and readers, the growth of Ready Print exploded. In 1865, A.N. Kellogg – and I’ve got a couple of the advertisements that he put in trade magazines for newspaper editors – when he started in 1865, he supplied Ready Print to five newspapers in Wisconsin. By 1878, there were 22 Ready Print suppliers serving 2,000 papers. By 1925, industry consolidation left one survivor: the Western Newspaper Union out of Chicago. This is an example of their advertising; these are sample sheets that they sent to editors to encourage the editor to subscribe to their agriculture columns. You can see the editor could choose which of these heading styles they preferred. The Western Newspaper Union, based in Chicago, ended up with 13 regional printing plants producing Ready Print for 15,000 weekly papers in the Midwest, South, and West. Historians speculate that every copy of a country paper was picked up and read, at least in part, by five or more different individuals. So, by one estimate, at its peak, Ready Print content was reaching 60% of the entire U.S. population. So, again, this is the Western Newspaper Union, not related to the Western Union telegraph company, with examples of their agricultural columns. Most of this content actually was just stolen verbatim from U.S. Department of Agriculture press releases, and then typeset in this manner. And then, here are two papers in which this is showing. You’ll see the top editor has chosen the livestock news column up at the upper left, and the road building at the right.
The editor of the lower paper has livestock news at the left, the dairy there in the middle is also a Ready Print column, and the poultry at the right. For some reason, raising chickens was always part of the women’s section. I don’t know why, but anyway, the editor at the bottom paid more because he’s getting 3 separate columns of content. Okay, so how much Ready Print is in Chronicling America now? Well, I’m trying to find out. It’s not easy. I’ve shown you some examples; there’s going to be more coming as more states come online. Some of the challenges I’ve had in figuring out how much is there: obviously, our content in Chronicling America is growing. Even if I identify a Chronicling America article and try to search for exact phrases, since the OCR differs from one vendor to another, I may not actually retrieve all examples of it. Early Ready Print consumers, Iowa and Wisconsin, were big buyers of Ready Print early on, in the 1870’s. But they’re not online yet, so I think when they come online we’ll see more. Selection criteria may trend away from rural weeklies because in some states, they may lack the significance. In Montana, rural weeklies are very significant. We have sparse population, very widely spread out, so rural pages like this one. After 1876, you’ll see either entire pages or entire articles, and it must be the same – exactly the same in every respect – to qualify as Ready Print. After 1910, it’s going to become more difficult to spot it. A lot of technology changes made it easier for Ready Print to sort of sneak in to local content. It’s harder to identify there. So, the next steps for me: I’m going to keep looking for this stuff. Start with known Ready Print, and search Chronicling America to retrieve other titles that have it. Start with known Ready Print clients, and I do have the client list for A.N. Kellogg for 1876, and another one is published in Rowl in- Roll? Rowl? 1889. I have one minute? Okay.
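Editorial aside on the OCR problem just described: exact-phrase search misses copies of the same Ready Print piece when vendors' OCR output differs. One common workaround is approximate matching. The sketch below is only an illustration using Python's standard `difflib`; it is not the speaker's method, the sample strings are invented, and the 0.85 threshold is an assumed value that would need tuning on real pages.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace so that line
    breaks and stray OCR marks do not dominate the comparison."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two OCR text snippets."""
    matcher = SequenceMatcher(None, normalize(a), normalize(b), autojunk=False)
    return matcher.ratio()


def likely_same_article(a: str, b: str, threshold: float = 0.85) -> bool:
    """Assumed cutoff: treat snippets above the threshold as the same piece."""
    return similarity(a, b) >= threshold


# Hypothetical snippets: the same sentence as two OCR engines might render it,
# plus an unrelated sentence for contrast.
clean_copy = ("The origin of rubber is traced to the forests of South America, "
              "where the natives gathered the milky sap.")
noisy_copy = ("The orlgin of rubher is traced to the forests of South Arnerica, "
              "where the natives gathered the milky sap.")
unrelated = ("Livestock news: prices for cattle rose sharply at the Chicago "
             "stockyards this week.")

print(likely_same_article(clean_copy, noisy_copy))
print(likely_same_article(clean_copy, unrelated))
```

A ratio like this only compares text. The layout test from the talk (same column width, same column breaks, same illustration placement) would still need a look at the page images.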
But that’s going to be a long and cumbersome process to find those Ready Print clients in Chronicling America selections, especially as Chronicling America selections change and grow. And then, the other option is for me, or anyone, to collect as many of these Ready Print client lists as they can, and then publish them for all NDNP, so that you can look up your own paper and see what you’re going to find printed in it. And that’s it. [ applause ]
Oh wow, that light is bright. Sorry. So hello everybody; I am going to move through this pretty quickly because I am hungry, and I’m sure you all are too. The Historic Oregon Newspapers, our website – our local site – has been live since 2009, or sorry. That’s incorrect, totally. We started our first NDNP grant in 2009, we went live with the site in 2011, so it’s been live for one year. In 2009, we got our first NEH NDNP award. We also got two other grants – one from the Oregon Cultural Trust, and one from our local LSTA program. We utilized these grants as complements to the NEH grant, so we bought server infrastructure, we set up an education program, and we also tried to do titles that wouldn’t necessarily be prime candidates for inclusion in NEH. So, when we went live in 2011, we had 280,000 pages, 100,000 of which were from that first NEH round, and the rest were funded elsewhere. Our site’s statistics, these look very familiar, I’m sure, as you saw from Deb’s. The best thing to point out, I think, compared to all of our other digital collections, is really how long people spend on the site. So, you can see here, we have over ten minutes, which is very much like Chronicling America. I’m not going to spend too much time on these statistics because we’re running short, but it’s very interesting, I think, to see how they fit our goals. So, for example, the visits here is 56,000 – a little over for the full year – but, when we did our first press release, we got a huge launch.
This doesn’t include that. We have great return, getting about 60% returning, but we want to get more new people coming, and so we’re going to try and switch that a little bit more too. We have people from over 140 countries, a large percentage from the Middle East, which I thought was very interesting. In the States – and I’m using Google Analytics, I’m sure many of you are familiar with this – you can see, this is the visits view, which shows that our largest metropolitan areas are where the most visits come from, but what is actually more interesting to me is the average time spent on the site per region. And you can see there, are towns appearing here, such as New Powder in Eastern Oregon, that I had no idea existed, Selma, and Burns is the one kind of over in the right, which is the only one where we actually have content from that particular city. The other ones, we do not have newspapers from those cities. So, how are they getting to us? We actually have just around half coming from search engines, the rest, direct and referrals. I wanted to show where the referrals were coming from: the top one was our own library website, which totally befuddles me, because if you go to our website, good luck finding our digital collections. [ laughter ] In fact, I’ll buy anyone a beer who can find it. Wikipedia, though, is huge; Facebook here, thanks to the folks at UPenn, the folks in the serials publications division at the Library of Congress, that’s where most of those hits are coming from. There are no hits coming from Chronicling America, because we have not put those 856 links in the directory, which is one of the reasons I am a proponent of that. All right our- they may be there now, yeah. Our top newspaper titles, of course, vary and it’s hard to tell because we keep adding more content. I wanted to show this primarily – the East Oregonian is one that we have both in NDNP and some content outside. The second half, we had the Oregonian and the New Northwest. 
The New Northwest is our suffragist paper, and it is also in NDNP. The person who edited that was Abigail Scott Duniway; her brother Harvey Scott edited the Oregonian. They did not like each other and they wrote back and forth in the paper. It’s a great sibling rivalry… Let me close this. The other important thing, though, that I wanted to show from this is that the Oregonian is available, of course, for purchase. The largest county library, Multnomah County Library, did purchase this for all of the people who have library cards, and it is still one of our top titles and our top requesters. And it just shows that so many people in this state can’t have access to it. None of the universities in the entire state can afford that package, to purchase the Oregonian, so the only source is through Multnomah County Library. Our top historic essay is the Medford Mail Tribune, which is also in NEH, on the Chronicling America site. So, I thought that was interesting; it appears on both sites. I’m not really sure what this tells us, but I like the idea of seeing how the top essay compares with the content. A little bit just going on our other pages on the site: this shows me where I have a lot of work to do. So, you can kind of see, I called the tab “History” where we include the essays, because first we tested “essays” and no one knew what it was. Well, I don’t know what people are looking for on the history site, but evidently it’s not essays, because as you can see they’re not spending a whole lot of time there, or on the sub-pages on that site either. The K-12 has, I would say, an embarrassing bounce rate, but if you go to that page, you’ll see why. We got some nice grant funds to have some lesson plans; we had the lesson plans and that’s there, but that’s all that’s there. And I think really what we need to do is focus on what we need to do in addition to lesson plans. How can we make that page attractive? Do we even do that?
Do we instead try to use something more like sending things through EDSITEment or some other gateway? The K-12 section has always been kind of the bane of my existence in trying to figure out how to make that more important, because history is not a required course past the 4th grade. The Help menu here, you can also see, has a huge bounce rate, so we need to work on that, and I want to try putting some of Ohio’s videos in there. Let me just go on, here, to Sheila.
So, we recently sent out a survey to academic and public reference librarians in Oregon to try to find out who is using the site and how it’s being used, in addition to just the statistics. So, some of the questions we asked on the survey were to try to find out how familiar the librarians were with the site, how often they referred patrons to the site, and what types of patrons they referred to the site. And also, to see if any of the librarians knew of our site being used specifically in either K-12 or college level classrooms, and also, if they were interested in learning more. We got about 50 responses total. All of the librarians that responded had at least heard of the site and said that they were familiar with it; some were slightly familiar, and some were more familiar, but that was good. However, the majority of the respondents only referred patrons to the site “less than once a week”, so you can see this little pie chart kind of describes visually how often they were referring patrons to the site. Also, the general audience seemed to be mostly just members of the public and genealogical researchers and historians. Not so much college students or educators; you can see that that’s lacking. And also, the site, they said, was mostly used for general historic questions and genealogical research.
Some of the respondents did add little comments saying that they had referred patrons to it for Sasquatch sightings and also Friday the 13th issues, which I thought was interesting. But, as far as use in actual classrooms, none of the respondents had any specific details about our site being used specifically in any classrooms. Although, we did have one person that said, they knew it was being used in the local high school, but they didn’t know exactly what class, and also, that several K-12 teachers seemed excited about it, but they didn’t know if they were specifically using it in their classrooms. And also, many of the respondents were interested in learning more, and they said that either a video tutorial or written how-to instructions would be the best way for us to communicate that information. And, we also did get some additional feedback on the survey apart from our multiple choice questions. Basically, just that most of them said they would use the site more or refer more patrons to it if content from their specific town was included on our site, which makes sense. And then, also, we just had a lot of praise for the resource – how much they appreciated it, and then, they were also very happy that we reminded them of the resource with our survey, and said that they would start trying to use it more often. So, not only was the survey a good tool for us to assess the usage of our site, but also to kind of remind librarians that it’s a great tool that they can use. 
So, I kind of mentioned some of the things that we were looking at in the data, and I just want to add, you know, the real goal of this presentation. People are asking us for data all the time, and hosting your own content is an excellent way to get more funding and an excellent way to figure out where your holes are. I also want to promote the Library of Congress’ newspaper viewer, because that open source tool, which does not take too long to install – I think it took 20 minutes when we did it at the Code4Lib session – was an excellent way for us to get things up quickly by the time we finished our first round of NEH funding. So thank you.
Okay, my talk is a little different, I think, than the previous. I’m not going to talk about what we are doing or have done, but rather, offer some modest suggestions for where we might go from here, outside of NDNP specifically. And, I realize some of these suggestions might go against the grain of NDNP, so hopefully they’ll be taken with a sense of open-mindedness and adventure. So, I titled this talk A “Good Enough” METS/ALTO Compiler, but then, as I thought about it after I had submitted my suggestion, I started to think of it, borrowing from the TEI community, the Text Encoding Initiative, more as what they call “rough cut” METS/ALTO, and so I’ll get to that in a second. But first, let me give you the impetus for this talk. The CBSR routinely gets calls from libraries around the state that want to digitize their local titles; probably 3 dozen in the last couple of years. And, inevitably, these talks fall apart when we start talking about pricing. I tell them it’s going to be 60 cents to a dollar per page; sorry, all. That’s still our price when we consider staffing. And they realize they don’t have 20-50 thousand dollars to see this project through to the end. So, I thought for some time, there has to be a cheaper way to do this, to produce some data that’s usable.
Not perfect, but usable in the CDNC, and I get the content in. The alternative, I’m afraid, is one of two things: these libraries are either going to go with commercial vendors and their data’s going to get locked behind paywalls, or they’re going to go out and do nonstandard data, which is what happened with the Whittier Public Library. They contacted us earlier this summer. They had gone with a company in Maryland that produced issue-level PDFs, and now they were looking to get those into the CDNC. Of course, they don’t have any money to do this. And so, I started to think – what can we do for Whittier? Let me just show you what their data looks like. So, they’ve got about 40 years, and this is one. You can see it’s issue-level PDFs; not particularly accessible. So, I thought, we have this METS/ALTO Compiler that D.L. Consulting created for us for our Born Digital Project. Let’s run this issue-level PDF through it and see what happens. We also have a Born Digital profile for Doc Works; I thought, let’s run it through there and see what happens. And so, I thought you’d be interested to see the results. So, very quickly, this is the METS/ALTO Compiler that we have, and you can see there’s the Whittier News. If you want to add a document – we have five that we’ve uploaded – you can see it collects basic metadata: the date, the edition, the specific title, and there are several for this one. And then, once you upload it, what you get is something like this. So, it’s broken down the issue-level PDF into its component pages. You can move pages around, and once you’re done, you hit download – I won’t do it here – it processes the PDFs and then you can ingest them into your system. So, I had our IT administrator ingest both samples, from the METS/ALTO Compiler and from Doc Works, and I thought I’d show them here. This 1910 issue comes from the METS/ALTO Compiler.
You can see it’s done a decent job of OCR. Actually, I should say this compiler does not have an OCR engine attached to it. The OCR comes from the company, and they’ve done a decent job of OCR-ing. I don’t think it’d be particularly difficult to hook in Tesseract, or Abbyy FineReader or something, but it’s not there yet. What’s more interesting to me, though, is if you look at the red line on the right hand side, you can see that it’s essentially red across the page; it hasn’t done any kind of zoning. Which, for future curatorial purposes, I think, is a major hurdle. It’s a flaw, or shortcoming, right now with the software. Let me show you what Doc Works did. So, Doc Works does a much better job of OCR-ing, as you would expect. It’s using Abbyy FineReader. It’s also done a slightly better job of zoning; here it loses it, it merges columns together. But, in general, it’s identified columns. So, I started to think, what would be a basic, a good enough, METS/ALTO? And I think on the METS side, you’d want the date, the exact title, the volume and the issue number, if present, and then logical page numbers. Obviously, it’s not going to pick up supplements and things like that that start re-paginating. And then, for the ALTO, I think you’d want acceptable levels of OCR, and then whatever one defines as an acceptable level of zoning for future curatorial purposes. So, that’s kind of “good enough” METS/ALTO, and hopefully I have enough time to talk briefly about how I think rough cut might differ from that. We tend to think of METS/ALTO as something we create, or our vendor creates for us; we ingest it into our systems, and unless we have data corruption, we largely forget about the data. But think of METS/ALTO as a series of specificities, starting with rough cut, and then maybe moving to page level, where you have more detail about the issue, and then various levels of segmentation.
So the CDNC kind of does basic segmentation, Australia does a little bit more, and Singapore actually gets into access rights for articles that are in copyright. You can see there’s this kind of continuum, and you might think of the creation of METS/ALTO as kind of an iterative process. So, something like Whittier might start out with rough cut, and then issues within Whittier slowly move down that continuum, being continually refined. Obviously, this kind of approach to METS/ALTO requires tools that don’t exist right now. I can imagine curatorial tools for library staff to do some of this segmenting, user curatorial tools to do user text correction, zoning, tagging, things like that, and, as Andrew and the person from Virginia were talking about yesterday, some of these programmatic curatorial tools getting applied to the data. So, in one sense, I guess I’m suggesting we might start to think about shifting some of our resources, and potentially identifying new resources, away from just the creation of data to these kinds of tools that will allow us to curate it and continue to improve it once we’ve ingested it into our systems. The advantage for us, I think, even if we just stop at a rough cut METS/ALTO, is two-fold. One, it gets more data in, more cheaply, and two, it helps us promote the standard METS/ALTO that we’ve worked so hard to develop and promote ourselves. So, if you just do that, I think you have two things going for you. If you go beyond that, and start developing some of these tools, I think, again, two advantages. One, you develop new and meaningful relationships with your users. Since we instituted user text correction, the average amount of time users are spending on the site has gone up, and I think it would be similar with these other kinds of user curatorial tools.
And secondly, from probably our own selfish perspective, we ensure the continued refinement of our data once it’s in our archives. It’s not just sitting there; hopefully, the accessibility and refinement continue to improve. So, I present this METS/ALTO compiler not as something we can use out of the box, although I think with a little bit of development, something like it probably could be produced. In fact, I know North Carolina has been experimenting with Abbyy’s server that can- one minute. Okay. They’ve been using Abbyy FineServer that can output ALTO. So I think we could get there, but I present this more generally as kind of a model that we might consider for future METS/ALTO creation and curation. And hopefully I came here and piqued your interest and gave you some thought for how we might do things in the future. Thanks.
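Editorial aside on the “good enough” checklist from this talk (date, exact title, volume and issue number if present, logical page numbers, plus some acceptable level of zoning in the ALTO). A minimal Python sketch of such a check follows. It is not the compiler shown in the talk: the `IssueRecord` class, its field names, and the rule that a page with a single `TextBlock` counts as unzoned are all assumptions made for illustration. `TextBlock` itself is a standard ALTO element, and the sample XML is a hand-written fragment.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class IssueRecord:
    """Hypothetical container for the issue-level fields named in the talk."""
    date: str                      # e.g. "1910-05-14"
    title: str                     # exact title as printed
    page_numbers: List[int] = field(default_factory=list)
    volume: Optional[str] = None   # only "if present"
    issue_number: Optional[str] = None

    def missing_fields(self) -> List[str]:
        """List required fields that are empty; volume and issue are optional."""
        missing = []
        if not self.date:
            missing.append("date")
        if not self.title:
            missing.append("title")
        if not self.page_numbers:
            missing.append("page_numbers")
        return missing


def count_text_blocks(alto_xml: str) -> int:
    """Count TextBlock elements, matching on the local tag name so the
    sketch does not depend on which ALTO namespace version is used."""
    root = ET.fromstring(alto_xml)
    return sum(1 for el in root.iter() if el.tag.rsplit("}", 1)[-1] == "TextBlock")


def looks_unzoned(alto_xml: str) -> bool:
    """Assumed heuristic: one block (or none) for a whole newspaper page
    suggests no zoning was done."""
    return count_text_blocks(alto_xml) <= 1


# Hand-written sample page: the whole page is a single block.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page ID="P1"><PrintSpace>
    <TextBlock ID="B1"><TextLine><String CONTENT="WHITTIER"/></TextLine></TextBlock>
  </PrintSpace></Page></Layout>
</alto>"""

record = IssueRecord(date="1910-05-14", title="", page_numbers=[1, 2, 3, 4])
print(record.missing_fields())
print(looks_unzoned(SAMPLE_ALTO))
```

Where the cutoff for acceptable zoning sits is a local decision; the talk leaves it as “whatever one defines as an acceptable level”.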
