Group Members: Stian, Rebecca, Kyungmee Lee, Minoo, Jane, Lixia
Currently, to find academic information, we have to go to multiple publishers' webpages, and then we end up with our Download directory full of "3049304934.pdf" and similarly named files. We take notes that get lost in transit, and we don't share them with others. It's very hard to come into a new field of knowledge and figure out which papers are important, how papers are connected, etc.
What would an ideal new system for academic publications look like? How could we use PDF metadata, semantic data, XML, microformats, social networking tools and other technologies to make it much easier for researchers to find and analyze data?
Our design interest has changed through our group brainstorming & discussion. We would like to focus on classroom language learning practice, specifically English and Cultural Literacy learning through collaborative reading. We are trying to combine different language learning tools with social media or computer-mediated communication tools to design an effective language learning environment. We suggest a learning & teaching model (or flow) for both local and global classroom contexts, as well as a learning environment design.
Minoo, Kyungmee, Jane
We would like to examine the ideal one-person workflow, specifically in the context of the academic researcher. Ideas that we want to look at include RSS feeds, going beyond PDF, semantic markup, etc.
Stian, Lixia, Rebecca
Notes from our Etherpad
- All citations in electronic versions of articles should be hyperlinked
- There should be an easy way of downloading citation information for the article, probably on the first page. The reader should be able to select the citation format he/she wishes to see.
- There needs to be a way for the user to specify preferences for naming the file. There should be a very systematic way of doing this. For example, authorlastname_date_tags.pdf; ziebarth_2010_museumaccessibility.pdf. Does anyone have insights about how they name their files - in a way that works?
- How about using the author's name, then 6 digits for rough GPS coordinates and 6 digits for date of publication (e.g. lin043079110216)? This would yield a manageable filename that is only alphanumeric (~18 characters), and so supported by older or smaller operating systems such as smartphone OSes, and would give the three major criteria to search by: name, location, date. The name could easily be decoded into city, province, country while taking up far fewer characters and providing a totally unique identifier.
- Rather than embedding metadata in the file itself it would make more sense, as Stian mentions further down, to have the data looked up online using the identifier, making revisions to tags or bibliographic data instant and global.
- Think about how iTunes or other programs deal with MP3s, it's actually a relational database. Instead of just one name, it's a bunch of fields: Artist, Track, Album, Title, etc. The great thing about this is that I can sort by whichever field I want, if I want to see all the articles written by Stian Haklev, I click there, if I want all the articles in Scientific American, I click somewhere else. (http://www.freelancepropaganda.com/archives/MP3vPDF.pdf)
- Great article! One thing I have noticed is that when I copy a CD to iTunes the metadata is not intact. The tracks are called 1, 2, 3, etc. and are not named.
- Cool idea! I guess I'm looking for a temporary solution, but the relational database idea sounds amazing. So I guess the question could be what would be useful fields.
- For a temporary solution, I know a lot of the paper managing tools let you rename PDFs automatically, like Papers (http://mekentosj.com/papers/). However, you first have to get the citations into the system, which is a hassle.
- Here's a really nice screencast of a system that lets you manage PDFs and snippets, then insert them into Word, etc. And it's open source. I wonder how easy it would be to integrate, for example, snippets from Kindle. http://www.sciplore.org/software/sciplore_mindmapping/
- Great, Stian! Personally I don't like to use mindmapping. For me, it takes a lot of time just to do the mapping... Here are other examples of PDF management tools: DevonThink and Scrivener.
- http://www.devon-technologies.com/products/devonthink/devonthink2.html It seems to be similar to Papers, which can be a powerful tool for managing PDF files. Additionally, I like the search function of DevonThink. To some extent it can search PDF files based on language as well as logical relationships.
- http://www.literatureandlatte.com/scrivener.php This is not just for managing PDF files, but for managing all kinds of files on our computer. It seems to have many different functions for editing PDF files. And I like this tree form better than mindmapping.
- Author, Date, Journal, Title of Article, Keywords, University Affiliation?, Rating system - like Amazon has?, Number of articles that cite the article? Other ideas?
- How about info such as how often the author's work is cited, tagged or reviewed online?
- Number of articles that cite the article should probably not be embedded in the file itself, but would be something that could easily be looked up online using the unique article identifier.
- If the unique article identifier worked, that would be super! I agree that embedding the citation count in the file would be impractical!
- It would be cool if you could assign hashtags to text that you highlight in a document. Then you could do a search for that topic on your computer.
- Also being able to share highlights with other people. Kindle, for example, tells you which passages have been highlighted most often by others; however, it only works with books bought at Amazon (not PDFs you've converted yourself), and there is no easy way of sharing annotations between friends, etc. (https://kindle.amazon.com/most_popular/highlights_all_time/26). Diigo is a great example of a tool used to annotate webpages, and there you can share with groups of friends; however, it only works on webpages. It would be great to have something that works on webpages, Word docs, PDFs, whether I'm reading them online, on my Kindle, etc.
- I agree with this. Diigo is a very good example, and I thought it might be better and easier to develop web tools or applications instead of PDF editing tools (at least in terms of file sharing and working collaboratively on PDF files). For example, in Google Wave or Public Pad, we could embed a function to import from a PDF file. They already provide an import function for Word, HTML or RTF; in the case of PDF files, however, it might be problematic in terms of copyright. Then we could split the screen: on the left side we import the PDF file as it is, and we copy or drag parts of it to the right side, which is the main discussion pad. Of course each copied part of the PDF file should be automatically hyperlinked, so that we can always import that part again. Or we could use Diigo's sticky notes function on the PDF file.
- One thing is document-centric sharing, i.e. I am reading this document: who else read it, what did they think about it, where are their notes and highlights? Another would be topic-centric sharing, i.e. here are all my notes about OER in China, including all the articles I read, all the snippets I tagged, etc.
- I would like to be able to see a specific reader's notes and highlights. I'm sure that would be possible too.
- It would be neat if you could assign @rebecca, for instance, to text that you highlight in a document. There could be an automatically generated email that sends the document with the text you highlighted for that person. The email would only be sent when you wish to send it, and if you have more than one section of highlighted text for the person, the email would send one document, with the sections of text highlighted for that person.
- It would be great to have a "select all highlighted text" button.
- The link above is a link to the conference "Beyond the PDF." Most of the stuff in there pertains to scientific research, but it still might be interesting to look at.
- It would be cool if your computer notified you if you already have the document on your computer and where it is when you try to download a document, even if you have renamed the file or downloaded it under a different name.
- Linked data, and semantic markup of articles. Anita De Waard from Elsevier gave a great presentation about this at KMDI, it's recorded here (click on her name): http://epresence.kmdi.utoronto.ca/1/watch/799.aspx Here's another good presentation http://indico.cern.ch/materialDisplay.py?contribId=7&sessionId=3&materialId=slides&confId=48321
- I agree! This was an excellent and inspiring presentation. Her talk centered around scientific data, which I found compelling, although I'm not sure how to apply that to other forms of research. What about qualitative research, for example? Could original field notes be equivalent to the scientific data that she refers to?
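Several of the ideas in the notes above (systematic file names, iTunes-style fields, and being told when a document is already on the computer) can be sketched together in a few lines of Python. This is only a rough sketch under our own assumptions: the function names, the table layout and the choice of SHA-256 are ours and do not come from any existing tool, and the journal name in the example is made up. Matching on a content hash only catches byte-identical copies, so a publisher that stamps each download differently would defeat it.

```python
import hashlib
import re
import sqlite3
from typing import Iterable, Optional


def suggested_filename(author_last: str, year: int, tags: Iterable[str]) -> str:
    """Build a name in the authorlastname_date_tags.pdf pattern from the notes."""

    def clean(text: str) -> str:
        # Keep only lowercase ASCII letters and digits so the name stays portable.
        return re.sub(r"[^a-z0-9]", "", text.lower())

    parts = [clean(author_last), str(year)] + [clean(tag) for tag in tags]
    return "_".join(part for part in parts if part) + ".pdf"


def content_hash(data: bytes) -> str:
    """Fingerprint the file contents, so renaming the file does not matter."""
    return hashlib.sha256(data).hexdigest()


def open_library() -> sqlite3.Connection:
    """Create an in-memory table with one column per field (hypothetical layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE articles ("
        "hash TEXT PRIMARY KEY, author TEXT, year INTEGER, "
        "journal TEXT, title TEXT, filename TEXT)"
    )
    return conn


def add_article(conn, data, author, year, journal, title, tags) -> Optional[str]:
    """Store an article. If the same bytes are already stored, return the
    existing file name instead of adding a second copy; otherwise return None."""
    digest = content_hash(data)
    row = conn.execute(
        "SELECT filename FROM articles WHERE hash = ?", (digest,)
    ).fetchone()
    if row is not None:
        return row[0]
    conn.execute(
        "INSERT INTO articles VALUES (?, ?, ?, ?, ?, ?)",
        (digest, author, year, journal, title,
         suggested_filename(author, year, tags)),
    )
    return None


library = open_library()
pdf_bytes = b"stand-in for the bytes of a downloaded PDF"
first = add_article(library, pdf_bytes, "Ziebarth", 2010,
                    "Made-Up Journal", "Museum accessibility",
                    ["museum accessibility"])
# Same bytes again, as if downloaded a second time as "3049304934.pdf".
second = add_article(library, pdf_bytes, "Ziebarth", 2010,
                     "Made-Up Journal", "Museum accessibility",
                     ["museum accessibility"])
print(first)   # None
print(second)  # ziebarth_2010_museumaccessibility.pdf
```

Because every field is its own column, sorting or filtering by author, journal or year is one query away (for example `SELECT title FROM articles ORDER BY author`), which is the part of the iTunes comparison that a single file name cannot give us.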
Some technical solutions
- All researchers should have a unique researcher ID (see more about this idea here: http://www.sciencedirect.com/science/article/B6T1B-4SVF3XY-7/2/6823f4d70422527b9ea15715d35b7d4e and here http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2603453/?tool=pmcentrez), so that we could easily track all of their contributions, etc. Trying to search Google Scholar for "Johnson, P." is almost useless. Who would assign these IDs? Are you talking about each individual generating this metadata through a particular PDF management tool, or about creating standard rules or metadata types for the whole academic field? In the latter case it might be a bit complicated, because there are sure to be many people named Johnson, P. or even Lee, K. (my name). Then we should think about one or two higher-level categories, such as specific field or school.
- All articles should have a unique ID, which makes it easy to refer to them, lets computers create citation graphs, and enables us to quickly look up the full reference and the link to the full text. (DOI is one example of this, but it's not perfect or universal: http://www.doi.org/ http://en.wikipedia.org/wiki/Digital_object_identifier) - Reference sections in our papers might become more complicated then... We can think about APA style here as well, if we suggest different metadata types for academic papers. One more thing to think about is what to do with other PDF files, which cannot be classified as academic papers but are still PDF files.
- Use microformats (http://microformats.org/about) (such as COinS: http://en.wikipedia.org/wiki/COinS) to embed semantic citation data in webpages, so that they can be read for example by Zotero (http://www.zotero.org/) and other tools
- Has anyone tried Zotero?
- Embedding citation information as metadata in the PDF file (this is a great article comparing academic citation management and MP3s: http://www.freelancepropaganda.com/archives/MP3vPDF.pdf)
- Publishing in XML or another semantic format instead of PDF. The idea is that instead of marking up the presentation (make this bold, put this on the right side), you mark up the meaning (this is the name of an author, this is the title of a book), and you let a stylesheet determine how it should be displayed (should book names be italic, should the title be huge?). This has several benefits:
- computers can more easily extract information (go get all the references in this article, and download all the articles that are referenced)
- you can easily create many different output formats from one source - including PDF, HTML, ebook, etc. The university might want a double spaced formal MA thesis, but I want something I can put on my Kindle, or even read on my cellphone – no problem
- Imagine being able to link directly not just to a document, but to a specific location within a document...
- downloadable versus regular html
- PDF is free to download and view
- file size is big, consumes a lot of memory, snapshot of pages, file size can be reduced
- free hand highlighting for all pdf documents versus only some
- no animation features - Jane
Could you explain more what you mean by this?
Hey, which point, the last one or all? I didn't go in depth explaining but resorted to brief points, although it seems most of my points have been elaborated through your explanations added above. Let me know and I'll check back later : )
Hi Jane, I am wondering what you mean by "no animation features"
I am trying to read a PDF for KMDI1002 and it isn't allowing me to highlight or make comments. Major dislike!!
Hello, to clarify the point you were asking about: you can't watch videos in a PDF file, but you can in HTML.
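Going back to the idea of publishing in XML or another semantic format instead of PDF: the benefits listed under "Some technical solutions" (computers extracting the references, many output formats from one source, linking to a specific location in a document) can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the element names are our own and not an existing publishing schema, the article content is made up, and the two "stylesheets" are just plain functions.

```python
import html
import xml.etree.ElementTree as ET

# Made-up article. The tags describe meaning (title, author, para, reference),
# not appearance. The element names are hypothetical, not a real schema.
ARTICLE_XML = """
<article>
  <title>An Example Article</title>
  <author>Example Author</author>
  <para id="p1">First paragraph of the article.</para>
  <para id="p2">Second paragraph of the article.</para>
  <reference id="ref-1">First cited work</reference>
  <reference id="ref-2">Second cited work</reference>
</article>
"""


def load(xml_text):
    """Parse the article source into an element tree."""
    return ET.fromstring(xml_text.strip())


def reference_ids(root):
    """'Go get all the references in this article': collect their identifiers."""
    return [ref.get("id") for ref in root.iter("reference")]


def find_paragraph(root, para_id):
    """Address one specific location inside the document by its id."""
    return root.find(f".//para[@id='{para_id}']")


def to_html(root):
    """One possible presentation: a web page with a large title."""
    title = html.escape(root.findtext("title", default=""))
    author = html.escape(root.findtext("author", default=""))
    paragraphs = "\n".join(
        f"<p>{html.escape(p.text or '')}</p>" for p in root.iter("para")
    )
    return f"<h1>{title}</h1>\n<p><em>{author}</em></p>\n{paragraphs}"


def to_plain_text(root):
    """Another presentation of the same source, e.g. for a small screen."""
    lines = [
        root.findtext("title", default="").upper(),
        "by " + root.findtext("author", default=""),
    ]
    lines += [p.text or "" for p in root.iter("para")]
    return "\n".join(lines)


article = load(ARTICLE_XML)
print(reference_ids(article))              # ['ref-1', 'ref-2']
print(find_paragraph(article, "p2").text)  # Second paragraph of the article.
print(to_html(article))
print(to_plain_text(article))
```

The source is written once, and the decision about how a title or an author name should look lives entirely in `to_html` and `to_plain_text`. A double-spaced thesis layout or an ebook layout would be one more function over the same tree.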