Update from CDRH

Last week, I had a meeting with Amanda, Laura, and Karin about how to wrap up loose ends with the children’s literature project, The Tar Baby and the Tomahawk. Before the site can launch, a number of Native American materials (mostly older books for children that feature Native American characters and/or topics) need to be added to the website. These materials are in various states of readiness: most have been scanned, subjected to optical character recognition (OCR), and encoded as XML files, but others have only recently been scanned and still need to be encoded. Additionally, the original page image scans for most of these books have had to be resized and formatted in Photoshop so they match the standard resolution and dimensions of other images on the site. Getting these materials web-ready has occupied much of my time recently.

While there is plenty of great content on the site right now, it is not all linked in a meaningful way. Another outcome of the meeting was the decision to link biographical entries and author content using a personography. A personography (a TEI term) is an XML file containing entries of information on various persons (hence the name) that can be linked to related content, making it easier for the user to find relevant information on the website. From talking with Laura, I’ve learned there are a few different ways of organizing a personography: for instance, a link for a person can point to a listing of content authored by him or her, or to a short biography. For this project, it has been decided that the xml:id attributes (which are used for linking) will be encoded in the individual document headers in order to cleanly tie the files to the personography. Laura, the metadata specialist and all-around XML expert, showed me different examples of personographies and got me started creating one for The Tar Baby and the Tomahawk site.
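
To make that a little more concrete, here is a minimal sketch of what one entry in a TEI personography might look like. The xml:id value and wording are hypothetical, not copied from the project’s actual files:

    <listPerson>
      <person xml:id="harris.joel">
        <persName>
          <forename>Joel</forename>
          <forename>Chandler</forename>
          <surname>Harris</surname>
        </persName>
        <note>Author and folklorist; collected the Uncle Remus stories.</note>
      </person>
    </listPerson>

A document header can then point at the entry with something like <author ref="#harris.joel">, so every file that mentions a person links back to a single canonical record.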

In its current, non web-published state, The Tar Baby and the Tomahawk can be browsed by author and date. Due to some inconsistencies, it was also decided during the meeting that the Browse section of the site will need to be reworked to give the user the maximum amount of control over what he or she is searching. In addition, some basic visual design elements of the site will be updated.

Working on this project has given me the opportunity to learn more about not only the digital humanities, but also project management–particularly how digital projects make it from conception to completion. While this children’s literature project has been in the works for a few years now, it has largely been a collective effort of assorted scholars and various students who have passed through CDRH over the years. Because much of the encoding on the site is the cumulative work of several students, it is time to go through this work and check it for consistency and completeness. Perhaps not surprisingly, one of the difficulties over the past few weeks has been assessing the work that is done and determining what still needs to be tackled. One of the things that has struck me while working on the site is how challenging it can be to start working in the middle of an existing project–so much catching up!

Meanwhile, there are also some style issues that need to be fixed before the site is published. I am meeting with Karin on Tuesday so we can hopefully resolve them. Until next time!

Posted in CDRH, Center, iSchool, Uncategorized, University of Michigan | Comments Off

So long, and thanks for the files…

Shortly after my last post, just as I was beginning to work on the Walt Whitman Archive and dig into the documentation on CodeIgniter, we finally received a batch of materials from the Omaha Public Library related to the Trans-Mississippi and International Exposition of 1898. The delivery included 118 original glass negatives and 3,605 TIFF files stored on over 100 CDs. This was exciting because I had been waiting for these materials all summer; part of the plan for my internship included digitizing them and creating their metadata. Sadly, this took me off the Walt Whitman project.

My first order of business was to scan the glass negatives and create some good ol’ Dublin Core files for them. It was nice to be working with photographs again; my undergraduate degree is in photography. I especially like the old glass negatives.

There’s something about the physicality of glass negatives that is intriguing: they’re so delicate, yet their foreignness makes them so powerful. Their weight gives the impression that they’re sturdier than modern plastic negatives, but deep down we know they’re much more vulnerable. And, of course, they’re over 100 years old, which always commands attention.

Digitizing the glass negatives didn’t take too long, and I was able to write a script that scans through a CSV file (derived from a spreadsheet supplied by OPL) and creates the XML for the metadata in 1.2 seconds; sometimes technology is amazing. The longest part was just completed, one hour before my tenure here at CDRH ends: I have spent the better part of three days transferring the files from the CDs onto our servers. Logged in at three different stations, I went back and forth between them, switching out discs and copying files, 65.2 gigabytes in total. I wish I could stick around and create the metadata for them.
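
For the curious, here is a minimal sketch of what one of those Dublin Core records might look like; the title and identifier below are invented for illustration, not taken from the OPL spreadsheet:

    <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>Grand Court at Night, Trans-Mississippi and International Exposition</dc:title>
      <dc:date>1898</dc:date>
      <dc:format>Glass plate negative</dc:format>
      <dc:source>Omaha Public Library</dc:source>
      <dc:identifier>transmiss.neg.0001</dc:identifier>
    </metadata>

The script simply mapped each spreadsheet row to elements like these, which is why it could churn through the whole CSV in about a second.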

I’ve had a great summer here at CDRH. This place is filled with brilliant, passionate people, and I’m honored to have been a part of it. I’ve learned a lot about juggling multiple projects with different stakeholders, and I have seen the power of successful communication between parties with different technical backgrounds. I’m looking forward to starting my classes next semester. My experience here has helped me realize what I’m really interested in, and I’ve changed my whole schedule around to accommodate it. It turns out I’m more interested in Preservation and Data Curation than Archives and Records Management. Of course, the argument can be made that they are not so separate, but that’s stuff for another post, or even a journal article.

I have no idea where I’ll be 10 months from now, but I do know where I’ll be from July 16th through July 19th: DH 13. See you there!

Posted in CDRH, Center, iSchool, University of Michigan | Leave a comment

The First Few Weeks

Hello! My name is Amanda, and I am the other intern at CDRH this summer. I am also an MSI student from the University of Michigan. Unlike JP, I will be here through the end of August. I am currently working on the tentatively titled children’s literature project, The Tar Baby and the Tomahawk. This online archive, which is not yet published on the CDRH site, explores the depiction of race and ethnicity in children’s literature during the late 19th and early 20th centuries.

Under the guidance of Laura (the metadata encoding specialist), Amanda (English professor and project co-director), and Karin (the digital resources designer), I have been working to improve the content, functionality, and design of the site. In this role, I have been learning more about TEI, XML, and XSLT. Probably the most challenging aspect of the experience so far has been trying to determine how TEI-compliant XML can be used to represent different document formats. Because the project has been the output of a number of student workers over the last few years, the XML encoding is not always consistent from file to file. As a novice TEI encoder, I have plenty of questions for Laura about how to encode these formats consistently.

In addition to learning some technical skills, I’ve also begun to develop a better understanding of how a digital humanities center operates. In the time I’ve spent trying to familiarize myself with both the project and the technology used to create it, I’ve gained a better understanding of the interdisciplinary nature of the digital humanities. In the creation of the children’s literature project, there has been input from humanities scholars, students from varying academic backgrounds, technologists and now–me! Interning here at CDRH has also given me the opportunity to see firsthand how traditional library and archival projects are being re-imagined on the web. Thinking about everything I have learned so far, I am excited to see what the rest of my internship has in store!

Posted in CDRH, University of Michigan | Leave a comment

I missed the memo about blogging

In der Tram, 1916
Well, this is about 7 weeks late, so I’ll give you the abridged version. My name is JP; I’m one of two IMLS interns at the Center for Digital Research in the Humanities for summer 2012. When I’m not living it up in the Great Plains, I’m a graduate student at the University of Michigan School of Information.

From the end of May until last week I was working closely with Keith Nickum to make some updates to the Railroads and the Making of Modern America project; if you’ve read posts by previous interns, then you know how awesome Keith is. We began by moving a copy of the site, including its database, to a private development server. This way, if I broke something, and I mean really broke something, then nobody would be the wiser, and I would sit alone with my failure. Thankfully, under Keith’s expert guidance, it hasn’t come to that.

Once it was moved, my first task was to clean up the MySQL database and make some of the table relationships more explicit. This involved changing the storage engine on most of the tables to InnoDB in order to support foreign key constraints. As a result, we are now able to protect certain tables from being dropped when the database is cleared and updated.

When we started looking at the front end of the website on the development server, we found some errors and warnings in the code that were being ignored on the live server. This provided me with a great opportunity to learn some PHP by sifting through lines of code to find the errors. Some of the errors and warnings came from deprecated functions, a common occurrence when programming languages are updated but applications are not; others were the result of inconsistencies in the data, which I found only by going through the site page by page. I spent the next few weeks fixing the code and creating new code for features Keith wanted, such as displaying more information about each document: the people, places, and organizations associated with it.

The hardest part was trying to figure out what the previous programmer had done, and to then mimic his style in order to keep things consistent. I can say with certainty that any application I write from now on will be documented and commented very, very well. The sentence I whispered most often from my little cubicle was: “What was he doing here?” If you read this, and you’re a programmer of any sort, please document and comment your work for posterity.

I start a new project this week that will introduce me to the guts of the Walt Whitman Archive. More on that later.

Image: “In der tram.” 1916. Sheldon Museum of Art, H-793. <http://railroads.unl.edu/documents/view_document.php?id=rail.art.002>.

Posted in CDRH, University of Michigan | Leave a comment

Reflections about scale and topic modeling

In the last few weeks (as you have seen from previous blog posts), I have been working on the topic modeling project, building on the ongoing, cutting-edge work being done here in the University of Maryland’s Computer Science department. (In the early part of the internship, as readers will recall, we were working on interface design and Scala programming around topic modeling with the Mallet toolkit, which takes a slightly different approach and was developed at the University of Massachusetts.)

The question of “scale” has been on my mind over the past couple of weeks. We are processing really vast amounts of text data — topic modeling is the kind of approach whose power of discovery is predicated on the assumption that vast amounts of data will be available for it to run on. It makes me pause and reflect that the expectation that these approaches will keep becoming more prominent and visible in the coming years rests on some further assumptions, both technological and social. For one thing, increased success for these approaches will depend on Moore’s Law continuing to hold (i.e., more and more processing power being available more and more cheaply), and also on the willingness (and legal ability) of the libraries and institutions that own such vast repositories of texts to make them available in computer-readable formats.

I realize that it is studying information science at an iSchool (I am an SI student at Michigan) that makes me think about these additional dimensions. If I had remained just a computer-science person, I probably wouldn’t have thought about how much of a socio-technical infrastructure is needed to put so much text online, and if I had remained a humanities person (which I also have been in the past), then it might not have occurred to me to think about the underlying technological breakthroughs in electronics that are making such continued scaling-up possible for approaches like topic modeling (and will hopefully continue to do so in the future). I appreciate how being a student of information science attunes me to think about the entire ecology within which a particular approach is being developed.

While the availability of vast and increasing volumes of data makes one think of issues of quantitative scale, over the last couple of weeks I have also come to appreciate what one might call the qualitative scale of the challenge posed by this approach, especially when one tries to improve on the sophistication of the underlying algorithm by bringing, for example, domain knowledge to bear on the problem. An example from what we have been doing: earlier, we were working with the “unsupervised” topic modeling approach, in which no knowledge of the content of the text is really needed — the algorithm simply cranks away at whatever text corpus it is given, and discovers topics in it. For the last week or so, though, we have focused on the brand-new “supervised” topic modeling approach being developed by the computer science folks here at the University of Maryland. The idea in “supervised” topic modeling is to “train” the algorithm by making use of domain knowledge. For example, for the archive of Civil War era newspaper articles that we are working with, we are drawing on related pieces of knowledge from sources outside the corpus, such as the casualty rate for each week and the Consumer Price Index for each month of the period during which these articles were published. The idea behind this approach is that the algorithm will discover more “meaningful” topics if it has a way to use feedback on how well the topics it discovers are associated with a parameter of interest. Thus, if we are trying to bias the algorithm toward discovering topics that pertain more directly to the Civil War and its effects, it makes sense to align the corpus with these other kinds of data — in our case, casualty figures and economic figures — which have a provenance outside the text corpus.

This is where the “qualitative” scale becomes important, I think. The person who uses this kind of approach successfully will have to have at least some grasp of a wide variety of other fields, and will have to know which information sources to consult in order to look up additional kinds of data and bring them to bear fruitfully on the problem. The sheer number of areas with which the successful practitioner of this kind of work must have at least a passing acquaintance will therefore “scale” up, the more intelligently we try to leverage these approaches’ power. It also made me realize, once again, that it is people trained in information science — a truly interdisciplinary field — who are well positioned to do this. Over the last week, for example, I read several papers on the economic history of the Civil War (which we were pointed to by Robert K. Nelson, a historian at the University of Richmond who has worked on topic modeling and history) — who would have thought that one would have to read something like that in the course of a summer internship in information science? I aligned the economic data with the text corpus, and based on what the data seemed to be telling us, I came up with a design for some experiments to test out some hypotheses, which we will carry out over the next few days.

Also, in a piece of exciting news, the paper proposal that we (Travis, Clay and I) submitted to the “Making Meaning” conference for graduate students, organized by the Program in Rhetoric at the English Department of the University of Michigan, has been accepted. In preparing this presentation, too — which is going to be a reflection on how one might situate approaches like topic modeling in the context of literary theory and philosophy — I think we will find that our interdisciplinary training as “information-science” people really helps us to see, and think, in terms of the “big picture” — to scale up to the big picture, as it were.

P.S. Given that this post was a reflection on the question of scale, it just occurred to me that it is also appropriate that the programming language I learned during the earlier part of the internship was — Scala!

Posted in Uncategorized | Leave a comment

Week 8

My time at CDRH is coming to a close – Monday is the start of my last week here. I’ve learned so much! Where do I even begin?

As for how I’ll be spending my last week here: formatting all my documentation! I have some plans for what I think would be most helpful for the next student who comes along to work on this project, including: a diagram of how the JavaScript and PHP files talk to each other; an annotated list of web resources that were invaluable to me; a set of future recommendations; a database design document that lists and describes the purpose of each table, the fields therein, and the relationships between tables; and a review of software tools that I used while working on my project. Documentation always made sense to me, but during spring semester’s Projects in Permanent Retention of Electronic Records course, I learned its true importance. And not just documentation, but documentation as you go. When my group was first assigned our project and told where to look for existing documentation, we were both excited and a little scared. The archival imaging machine we were tasked with getting up to spec was a little too decontextualized for our taste. We knew what it was for, sort of. We knew some of the individuals who had worked with it. But we figured that if we could talk to everyone who had shared its past and find out what had worked, what hadn’t, and the rationale behind certain design and software choices, then we and anyone who came after us would be able to make considerably more progress than if we had to make the same mistakes all over again.

By the end of the semester, we had performed and transcribed a series of oral history interviews, exhausted a wiki, added to the archival imaging procedures and turned them into an illustrated manual, created a visual topology of the machine, and compiled an abbreviated, narrative version of the wiki into a project report. (Granted, I got lucky – my group was amazing!) And even after all that, we all still felt that there was so much more to write down, so much more material to cover, so much more to do! The wiki had been essential for providing a space for us to jot down whatever tests we had run and whatever research we had done – a space to propose hypotheses about what went wrong and to figure out what needed to happen next. We tried so many different things that had we not documented it all as we worked, there would have been no way to reconstruct all the things we had done that failed (arguably the most valuable information for someone taking over the project) and all the places we had looked for answers. During this summer, I’ve been keeping track of questions and discoveries each day so that when I come in to work in the morning, yesterday’s questions propel me forward in my work. And now, at the end, I can compile that into a tool to propel someone else forward. The iSchoolDH blog has also been a useful source of documentation for me; I’ve already referred back to previous posts in the last few days to grab some info for the resources I’m currently compiling. So: documentation, good.

My time here has undoubtedly informed my understanding of what Digital Humanities is, as well as forced me to think about its implications for many areas of scholarship. It has also stirred my imagination as far as the shape digital humanities might take at UT, where iSchoolers fit into the equation, and how I might be involved in answering those questions.

I would like to say thank you to Keith Nickum, Programmer at CDRH, for all his help and patience. He is the creator of the Whitman Tracking application and has been a tremendous resource over the duration of my time here. Thank you to all the CDRH faculty and staff for making me feel at home in Lincoln.

Posted in Uncategorized | Leave a comment

getting there with XSLT

(Note: This is from Molly Des Jardin at CDRH.)

Now that I am, as the title implies, “getting there,” I want to reflect a little on the learning process that has been XSLT. In my last post I glossed over what makes it (and functional programming languages generally) distinctive and, for people who are used to procedural languages, unintuitive and hard to grasp at first. This will be a post with several simple points, but that’s quite in keeping with the theme.

The major shift in thinking that needs to happen when working with XSLT, in my opinion, is one of trusting the computer more than we are accustomed to. It all stems from letting go of telling the computer how exactly to figure out when to execute sections of code, and letting it make the decisions for us.

I made a comment recently: “I know I’m getting more comfortable with XSLT because suddenly I’m trying to use recursion everywhere I can, and avoiding the for-loop like a crutch.” As others I talked to put it, this is idiomatic XSLT.* In other words, it’s one of the mental leaps that you (and I) have to make in order to start writing elegant and functional code (no pun intended) in this language.

What is recursion? In this case, to oversimplify, it’s how XSLT loops.** In a procedural language – C++, Java, most languages other than Lisp dialects to be honest – recursion is clunky and wasteful; telling the computer to specifically “do this for the number of times I tell you, or until this thing reaches this state” is how you get things done. This means that the languages have state, too – you can change the value of variables. This is important for having counters that are the backbone of those loops. If there were no variable to increment or change in another way, the loop would either never execute (such as a while), only execute once, or loop endlessly. None of these things are very helpful.

So how do you manage without a counter-based loop, at least of the “for each thing in this set” variety, in a stateless language (all variables are permanent, a.k.a. constants) that discourages use of for-each loops in the first place?

The first answer is the simpler one: xsl:apply-templates or xsl:call-template. This involves the trust that I introduced above. With a procedural language it’s hard to trust the computer to take care of things without telling it exactly how (keep doing this thing until a condition is met), because you’ve become so used to spelling everything out. It might have been hard, once, to get used to explaining the proverbial peanut butter sandwich recipe in excruciating detail for the sandwich to get made. Now, XSLT is forcing you to go back to that higher level of trust, where you can tell the computer “do this for all X” without telling it how it’s going to do that.

xsl:apply-templates simply means, “for all X, do Y.” (The Y is in the template.) It’s unsettling and worrying, at least for me at first, to just leave this up to the computer. There’s no guarantee that templates will ever be executed, or that they will be executed in order. How can I trust that this is going to turn out okay? Yet, with judicious application of xsl:apply-templates (like, where you want the results to be), it will happen.
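
As a minimal sketch of that idea (the element names here are hypothetical, not from TokenX), this stylesheet fragment says “for every item, make a list entry” without ever spelling out a loop:

    <xsl:template match="/">
      <ul>
        <!-- "For all item elements, do Y": the processor decides how and when -->
        <xsl:apply-templates select="//item"/>
      </ul>
    </xsl:template>

    <xsl:template match="item">
      <li><xsl:value-of select="."/></li>
    </xsl:template>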

Second, the recursive aspect. Keep calling the template until there are no more things left – whether that’s a counter, or a set of stuff. But how to get a counter without being able to change the variable? With each xsl:apply-templates (or call-template), do so with xsl:with-param, and adjust the parameter as needed. Call it with the rest of the set but not the thing that is being modified in the current template. When it runs out of stuff, that is when results are returned. Again, it takes the explicit instruction – xsl:for-each is very heavy-handed – and turns it into “if there’s anything left, keep on doing this.” It may seem from my description that there’s no real difference between these two, and in their end result, there isn’t. But this is a big leap, and moving from instinctively reaching for xsl:for-each to xsl:apply-templates is conceptually profound. It is getting XSLT.
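
Here is a small sketch of that recursive pattern (the template name and parameter are hypothetical): a “loop” that counts down by calling itself with a smaller value, since no variable can ever be changed in place:

    <xsl:template name="countdown">
      <xsl:param name="n"/>
      <xsl:if test="$n &gt; 0">
        <item><xsl:value-of select="$n"/></item>
        <!-- No counter is incremented; we recurse with an adjusted parameter -->
        <xsl:call-template name="countdown">
          <xsl:with-param name="n" select="$n - 1"/>
        </xsl:call-template>
      </xsl:if>
    </xsl:template>

When $n reaches zero there is simply nothing left to do, and the accumulated result is returned: the base case takes the place of the loop-exit condition.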

Finally, a note on the brevity and simplicity of XSLT. I’ve noticed that once I’ve found a good, relatively elegant solution to what I’m trying to do (they can’t always be!), suddenly my code becomes very short and very simple. It’s not hard to write and I don’t type for a long time. It’s the thinking and planning that takes up the time. Obviously this is true for programming just about anything, but I find myself doing a whole lot less typing this summer than usual (compared to languages I’ve used such as C, C++, Java, Python).

It’s both satisfying and disappointing at the same time: getting a template that recursively creates arbitrarily nested menus makes me want to jump up and high-five myself; the fact that it’s only about four lines long and incredibly simple makes me wonder whether any of it was that hard to begin with. But this isn’t limited to XSLT, or even to programming: the 90-page thesis seems like more work than the 40-page thesis, but if the shorter one is talking about more profound ideas and/or is simply better written, the length-and-time comparison falls apart. The time spent typing and the length of the output don’t tell us as much as we’re used to assuming.

That’s what I have to say about what I’ve been doing this summer, as far as learning XSLT goes. I still can’t say I like it. The syntax is maddening. I haven’t been in this long enough to judge whether it’s the best choice for getting something done within a lot of constraints. But at the very least I’ve finally had that brain shift again, the one I had with Lisp so long ago, to a different approach to problem-solving entirely. And that feeling is profoundly gratifying.

Speaking of a good feeling, I’ve been able to have extended chats with multiple people about XSLT on the U of M School of Information mailing list this summer, after someone posted asking for help with it. It’s a good thing I replied despite thinking “I’m not an expert, so I probably don’t have much to offer.” Talking with the questioner and the others who replied-all on our emails was really enlightening: getting feedback, hearing others’ questions about how the language works (questions that I hadn’t articulated very well myself), and giving my own feedback. There’s nothing like teaching to help you learn. I would not have been able to write this post before talking to my fellow students and figuring it out together. (Or, you would have read a very unclear and aimless post.)

(Lastly, I’d like to recommend the O’Reilly book XSLT Cookbook for using this language regularly after getting acquainted with it. If I were continuing on with an XSLT project after this internship, or working on adding more to this one, I’d be using this book for suggestions.)

* Thank you all for reminding me that this word exists.

** XSLT now includes not only xsl:for-each but also XPath’s for expression. These do have their appropriate uses and I do use them quite a lot, because my application doesn’t give me a huge number of chances for recursion. I’m being dramatic to make a point.

Posted in CDRH, Summer 2011, University of Michigan | 1 Comment

Week 7

‘Digital humanities’ and ‘digital scholarship.’ To many, this distinction may seem pointless or premature, but I’ve been struggling to articulate what that distinction is. Why? Because after bookmarking a ton of sites (anywhere from centers’ and studios’ project websites to the personal and professional blogs of digital humanists), I’m finding that the range of work in humanities computing is so broad that a distinction is called for. Besides, scholarship evokes something specific, namely finding an answer to a question or resolving a contradiction using the dialectical method. Digital scholarship, in my opinion, should maintain that dialectic character but should also be experimental with the digital format in terms of how it establishes and, more importantly for this post, propels that dialog forward.

I think one reason that traditional, print-based scholarship is hard to move away from – why it’s hard not to make it the baseline for evaluating other forms of scholarship (performance, scholarly digital experimentation, etc.) – is that its evaluation process helps it fit quite nicely within the parameters set by definitions of the dialectical method.

According to Wikipedia, “Scholarly peer review is the process of subjecting an author’s scholarly work, research, or ideas to the scrutiny of others who are experts in the same field [that presumably hold different viewpoints about a subject], before a paper describing this work is published in a journal. The work may be accepted, considered acceptable with revisions, or rejected. Peer review requires a community of experts in a given (and often narrowly defined) field, who are qualified and able to perform impartial review.”

I would argue that projects like http://whitneyannetrettien.com/thesis/ are part of the future of digital scholarship, and I also think calling this example work ‘scholarship’ makes some feel uncomfortable. Why? Does it make a reasoned argument? I think so, yes. Do the format and the content imply an invitation for dialog? I think so, yes. There aren’t many things like Trettien’s that I’ve found. And I’m not saying that I think she’s the greatest writer or that her arguments are mind-blowing – or maybe they are – the argument itself is not the point. The point is that she presents a reasoned argument, that the presentation is visually and functionally creative, and that her platform is the web. What’s not dialectical about that combination?

So maybe the question is ‘dialog with whom?’ Is it speaking to a narrow field of qualified experts? What field? History of science? Humanities computing? Computer science? Literature? If we aren’t sure what field to classify it under (partly due to the platform – are we readers or are we users?), then who is qualified to review it – who is invited to participate in the dialog that gets at the truth of the matter? And does the scholarly peer review process fail when you check the box that says ‘all of the above’?

My point is: we should allow digital scholarship to mean something fundamentally different, and its evaluation process should reflect that. Alternatives to peer review, and peer review in electronic publishing, have been written about for more than a decade and are still being written about, but still in the context of the journal article.* (1, 2, 3)

Okay, so what do I think digital scholarship should look like? Well, first I would say that the future of digital scholarship should look a little more like what Trettien is doing (i.e., not just an essay, not just a website, not just a visualization, not just a performance, but all of these things on the same page) than, say, an article in First Monday (although First Monday is a consistent source of engaging scholarship about the most relevant topics in the field of information studies and beyond). But aside from that, it needs its own evaluation process, negotiated by the people doing the work. The idea of a process being negotiated by the digital scholars themselves, instead of by the scholarly community at large, is most obviously because they have a vested interest in making sure that the field continues to gain respect, scholarly heft, or whatever else you want to call it. Another reason I’ll offer up is that a process can be adapted to work in different contexts, but sometimes, even after investing considerable overhead in ‘making it work’, it can still fall short of doing what it’s intended to do well.

Excuse my ramblings. This is an ongoing thought exercise for me.

* 1. Fitzpatrick, K. (2010). “Peer-to-peer Review and the Future of Scholarly Authority.” Social Epistemology, 24(3), 161-179. doi:10.1080/02691728.2010.498929
2. Roberts, P. (1999). “Scholarly Publishing, Peer Review and the Internet.” First Monday, 4(4), 5 April 1999.
3. Striphas, T. (2010). “Performing Scholarly Communication.” Differences & Repetitions Wiki, August 25, 2010.

Posted in CDRH, Summer 2011, University of Texas | Leave a comment

Further adventures in topic modeling

I realize that I hadn’t properly introduced myself to readers in my previous postings here at the iSchools-DH blog. (I am Sayan Bhattacharyya, by the way — I’m mentioning my name in the body of the post because this blog doesn’t currently seem to display posters’ names next to posts.) As my WordPress profile states:

“I grew up in India, was trained originally as an engineer, later did a PhD in Comparative Literature at the University of Michigan, and now am a master’s student in the School of Information there. I love riding bicycles.”

I think that pretty much sums me up in a nutshell, and I should add that I have been riding my bike to MITH since a couple weeks after arriving here. The weather here in College Park, Maryland has been very nice for riding a bike so far.

This past week has been an interesting one. I finally finished up (with much general help from Travis, who works here at MITH) a change I had been making to the existing topic modeling code in Scala, which will give the reader a little more flexibility and a richer understanding of the topics discovered by the algorithm. The change itself wasn’t conceptually complex at all, but it was a very useful learning experience for me in terms of acquiring the skills to program in Scala. Although I had programmed in a functional language in the past, Scala is quite a different cup of tea, because it aspires both to be a better Java and to be a full-fledged functional language, which makes its syntax interestingly complex.

Something that I have been thinking about since I started the internship is the question of how to “theorize” topic models for a humanities audience. Unlike, say, material artifacts such as hard drives (about which a faculty member here at MITH, Matthew Kirschenbaum, has written very interestingly), topic models are very abstract concepts. It is simply difficult to talk about them with people who are not programmers or mathematicians or statisticians. So, when talking to humanists about what topic modeling is (see my previous post for earlier reflections on this), some careful thought is required. In situations like this, I think, digital humanities has the potential to forge interesting connections with literary theory, and to use the language of theory in the humanities (of which non-digital-humanists all have at least a fair grasp) to “theorize” what it is that is happening in such things as topic modeling. An interesting thing that happened this week was that I received, on the University of Michigan School of Information mailing list, an announcement about a conference in the English department back at UM at the end of September, organized by the Rhetorical Studies group, for which the organizers are soliciting presentations from people working in corpus analysis as well as from more traditional scholars of rhetoric. We wrote up a proposal yesterday, and if the submission is accepted, we may be talking in September about how techniques like topic modeling, when applied to the interpretation of text, stand in relation to longstanding fields of inquiry in the humanities that have concerned themselves with interpretation and hermeneutics.

Posted in MITH, University of Michigan | Leave a comment

A belated introduction

(This is Molly Des Jardin at CDRH.)

Hello readers, and to the rest of the summer 2011 interns!

My name is Molly and I’m working at CDRH this summer in Nebraska, mainly on redesigning and implementing a new interface for TokenX, a text analysis tool developed by Brian Pytlik Zillig. If you’d asked me at the beginning of the summer what I thought about working on a project this big using XSLT 90% of the time, I’d have said: “those are just stylesheets for XML, right? Bring it on.”

It’s not like I was wrong per se. XSLT stylesheets are just that, but I’m finding that the term is vague to the point of being entirely meaningless. Does “stylesheet” help you understand what CSS is? (Then again, does “cascading” make it all make sense? No.) But because this is the first place I encountered that word, here I was thinking XSLT is the equivalent of CSS for XML. Yeah.

It’s totally not. If I could send a message back to myself on May 11, when I started my internship here, I’d yell “it’s a functional programming language with syntax designed by a sadist – don’t you wish you’d kept up with Lisp now?” Yes, I do wish.

So the entries you’ll get from me for the rest of the summer will spare you my slow learning of what exactly is going on, and take you instead right into trying to coax my understanding of XSLT, along with TokenX’s original code, into forming something that makes sense and works. It’s a tall order, but I refuse to come up short.

When I’m not persevering in the face of the learning curve, I get to spend time with TokenX’s new HTML5 site, learning about what everyone else is doing at the center, checking out DH2011, and learning about linked data in Ireland. It’s a good summer.

I’ve come to this internship with what they call a non-traditional background, so I’ll give you a little bit about myself. For the past 6 (yes, 6) years I have been at the University of Michigan, working my way through a PhD program in Japanese literature (with my thesis turning more toward book history), and since 2008, an MSI in Library & Information Services at the School of Information. I never thought I’d get to say these words, but things are coming to an end: I’ll graduate with an MSI in December 2011, and with my PhD in 2012. What then? Digital humanities, of course!

Until last year, I didn’t know that this field was out there, but I’d been doing projects (and research) that could be described as “DH” for years. In fact, as an undergraduate I juggled the courseloads of both a BS in computer science and a BA in history (Japanese, of course). When I came to graduate school, it never occurred to me that I could both program and write a dissertation on 19th-century books – you know, without doing two PhDs. I’m thrilled to find a place where I can actually do both without having to give up either one.

Stay tuned for more on the fun of using XSLT to do something large and complicated, and for some of the many blog posts in my backlog that were inspired by DH2011 and the DHO summer school’s metadata/linked data workshop from July.

Posted in CDRH, Summer 2011, University of Michigan | Leave a comment