The Overly Caffeinated Librarian: September 2009

Tuesday, September 29, 2009

Illinois Web Accessibility Conference and Expo: Session 3 – Content Management Systems 1

Speaker: Christian Johansen (Plone, open source on Python), Penn State University

Christian is from Weblion. Most of their work now is with content management systems. They are very active in open source communities, most notably on plone.org (and team members serve in key roles on Plone committees).

A CMS incorporates easy publishing + a workflow = explosion of content.

There are over 700 CMSs out there right now. Most are open source. But technology doesn't matter any more. What really matters is user orientation [training?].

Their goal isn't to create/modify something for themselves, but to integrate their accessibility enhancements into the projects' core packages (update Drupal templates, add ARIA support). To get the maximum effect, make sure to roll it in to the main distribution, not into silos in your own institution. Acceptance of your code is based on meritocracy, including an official review structure/organization.

When choosing a CMS, take the requirements from your clients and shelve them. Instead, focus on the health and vitality of the community behind the project. Is leadership shared? Are new members helped/welcomed? What is the size of that community? (That is likely the best indicator.)

Early on they couldn't hire any Python programmers, so they had to hire and then train their own people, a dubious proposition then, but that has likely paid for itself now (Plone work).

Your role is to make your agenda transparent. If you are competent, consistent, and collegial, you will accrue influence that can allow your requirements to be folded in to full distributions.

Most open source development operates around the bug tracking and ticketing system. You submit your enhancement as a ticket, which is reviewed by the community and accepted or rejected. If accepted, it goes in to the development and testing phase, and is then released in beta form (followed by feedback for code changes).

Christian demos the ticket process for a Plone requested change, using some screen shots.

Finally: Your participation in open source communities improves accessibility.

Speaker: Brandon Bowersox (Drupal, open source PHP), OJC Technologies from Urbana.

He's done a great deal of work for the Iowa Department for the Blind. Many of their changes are being rolled in to the Drupal 7 release, which is due out in early 2010.

The top layer of accessibility is the content itself: are users prompted to add accessible features when they upload/create content? Do images prompt for alt text? Are users prompted to check external sites they link to?

Then you have the structure- aesthetics, page layout, templates, themes, navigation, search, color, etc.

The lowest layer is the CMS/LMS: the editing and admin tools that come with the product, WYSIWYG editor, etc.

Web design from start to finish:

Team selection – Include someone from the campus accessibility community, even if that's just a screen reader user.
Vendor/CMS selection – Request a demo copy, require an RFP that includes accessibility checklists.
Tech planning – Determine features, add-on modules, and whether you are going to use AJAX, Flash, or Google Analytics (and make sure you know how to do each bit in an accessible way).
Graphic design – Things like skip-to-content links, where titles will be displayed, design colors with contrast at an acceptable level, and your HTML markup and related CSS.
HTML/CSS theme – Related to site-wide things, like headings, form objects, ensuring linear reading order.
Production – Iterative testing (which should be happening at each previous step where possible), testing the creation/editing of features.
Testing – If you wait until this phase to check for accessibility, it's too late. This is testing with "real users" for bugs, for most situations.
Launch and Maintenance – Make sure to have ongoing testing, especially when new tools are added. It is vitally important to solicit user feedback in an ongoing way.
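
The graphic design step above mentions picking colors with contrast "at an acceptable level" but the talk didn't say how to measure that. One common way is the WCAG 2.0 contrast ratio formula; the sketch below assumes that formula and six-digit hex colors (the function names are mine, not from the talk). WCAG's AA level asks for at least 4.5:1 for normal-size text.

```python
def _linear(channel: int) -> float:
    """Convert one 0-255 sRGB channel to its linear-light value (WCAG 2.0)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a color given as '#rrggbb' or 'rrggbb'."""
    hex_color = hex_color.lstrip("#")
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """WCAG contrast ratio, from 1.0 (identical) up to 21.0 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)
```

For example, mid-gray `#777777` text on a white background comes out just under 4.5:1, so it would narrowly miss that AA threshold.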

One part of their project for Iowa was to produce their RFP response as a fully accessible PDF.

Some of what they do is provide multiple options for content contribution: the WYSIWYG editor+markup, direct HTML editing, or simple text input with comments (do this or that) for other developers to follow later (say a non-sighted user posting content that needs layout work from a sighted user).

Always consider what parts of an interface can be removed to make it easier for everyone. When new modules/features are being coded/added, do the same type of evaluation to just cut the dross to start with.

They generally only use about a dozen add-ons/modules for Drupal. Although there are thousands out there, it's impossible to keep up and verify that all those modules are natively accessible (and generate accessible content).

They also provide audio snippets of audio books. They use the browser's native player by default, but provide multiple other file formats for users to choose if that won't work for them.

Making a CMS accessible: make sure you do this for an open source project, and contribute your developments back in to the source.

They updated some AJAX menus to be obvious to screen reader users. They also made form error feedback appear at the top of the page, obvious, with links to the problem areas.

Speaker: Mike Scott (OneNet, openSource .NET), Illinois Department of Human Services

About 7-8 years ago, IDHS started exploring CMSs for the intranet and web site. This led to them (being dissatisfied with what was available) eventually building their own, which is now available as open source.

There are two parts. The first is the part the developers work with, like CSS and templates, which they must make accessible. The next is the content layer: the tool must allow non-technical end users to reliably create accessible web content.

"This is not a knowledge problem, this is a tools problem" Jon Gunderson, University of Illinois

After years of trying to train people to create accessible content, they realized this was the case, and that training alone was never going to get them to their goal of accessible web content creators/content. To get them to remember and apply all the best practices just wasn't working.

They then did a study comparing ten different authoring tools, with sample content for users to create as a document. Users were tasked with making the content, to see which tools inherently generated accessible content. This revealed that "it's not enough to make accessibility possible, it must be automatic." With Dreamweaver/Contribute in particular, it became clear the tools were there, but hidden, and not part of the default authoring flow in any of those systems. This led to their development of OneNet.

OneNet is meant to prevent bad choices to start with (for example, font tags, which aren't semantic). They removed those, and replaced them with structural markup options (like headings). As much as possible, all accessibility markup coding needed to happen automatically. They also needed to provide a way for users to check their work, since some accessibility features must be checked by a human (checking that alt text exists for an image isn't the same as making sure it is a useful description). So throughout, their tool gives prompts and explanations on what accessibility features need to be added, and walks users through it.

Right now they are using OneNet to manage about 40,000 pages (intranet and internet) with about 350 contributors.

Pages all have the edit button on them for logged-in, authorized users. Clicking it opens the page in their web based editor. They also wanted to make sure the editor itself was as accessible as possible. Theirs isn't 100%, but it is pretty close. They've included lots of keyboard shortcuts, etc. The latest version of JAWS can almost use their editor.

If they choose the wrong heading (a possible accessibility error) they are prompted about the problem, and offered a fix they can accept or reject.
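
The talk didn't spell out what counts as a "wrong" heading. A minimal sketch, assuming it means a heading that skips a level (say an h3 directly after an h1), could look like this. The class and function names are made up for illustration and this is not OneNet's actual code.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the levels (1-6) of h1..h6 tags in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def skipped_heading_levels(html):
    """Return (index, found_level, suggested_level) for each skipped level."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    problems = []
    previous = 0  # treat the start of the page as "level 0"
    for index, level in enumerate(collector.levels):
        if level > previous + 1:
            # Assumption: the suggested fix is the next level down from the previous heading.
            problems.append((index, level, previous + 1))
        previous = level
    return problems
```

An editor could show each returned tuple as a prompt ("this h3 follows an h1, change it to h2?") and apply the suggested level only if the author accepts.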

When it sees punctuation based "fake" lists, it will automatically change it to a real list as soon as a user starts editing it.
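
To make the "fake" list idea concrete, here is a rough sketch of how a run of lines could be recognized as a punctuation-based list and turned into real list markup. The patterns (dash, asterisk or bullet for unordered; "1." or "1)" for ordered) are my assumptions, not a description of how OneNet detects them.

```python
import re
from html import escape

# Assumed patterns for what a "fake" list item looks like.
BULLET = re.compile(r"^\s*[-*\u2022]\s+(.*)$")
NUMBERED = re.compile(r"^\s*\d+[.)]\s+(.*)$")


def convert_fake_list(lines):
    """Return <ul>/<ol> markup if every line looks like a list item, else None."""
    for pattern, tag in ((BULLET, "ul"), (NUMBERED, "ol")):
        matches = [pattern.match(line) for line in lines]
        if lines and all(matches):
            items = "".join(f"<li>{escape(m.group(1))}</li>" for m in matches)
            return f"<{tag}>{items}</{tag}>"
    return None
```

Returning None for anything that isn't clearly a list keeps the conversion conservative, which matters if the change is applied automatically rather than offered as a prompt.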

It prompts when you add links ("this link text already used on the page" and "please make sure the link text is meaningful").
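
A sketch of how those two link prompts might be generated from a page's HTML. The list of "vague" phrases is purely my assumption, and this simple version ignores where the links point (two links with the same text and the same target would arguably be fine).

```python
from collections import Counter
from html.parser import HTMLParser

# Assumed examples of link text that says nothing about the destination.
VAGUE = {"click here", "here", "more", "read more", "link"}


class LinkTextCollector(HTMLParser):
    """Collect the visible text of each <a> element, whitespace-normalized."""

    def __init__(self):
        super().__init__()
        self.texts = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._current = []

    def handle_data(self, data):
        if self._current is not None:
            self._current.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self.texts.append(" ".join("".join(self._current).split()))
            self._current = None


def link_text_warnings(html):
    """Return (link_text, message) pairs for repeated or vague link text."""
    collector = LinkTextCollector()
    collector.feed(html)
    collector.close()
    counts = Counter(text.lower() for text in collector.texts)
    warnings = []
    for text in collector.texts:
        key = text.lower()
        if counts[key] > 1:
            warnings.append((text, "this link text is already used on the page"))
        if not key or key in VAGUE:
            warnings.append((text, "please make sure the link text is meaningful"))
    return warnings
```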

When images are uploaded, a user is asked about the type of image (decorative, simple, or complex-like a chart or graph).

At the point in the workflow where a user marks a page as done (for publishing or the next level of review), all the accessibility checkers are fired up sequentially (but you can also choose to run them from the navigation menu as you create content). As users accept remediation options in the pop-up window, the actual content in the WYSIWYG is also updated.

When it sees a "fake" list, it will tell you so, and ask you to pick the correct list style (ordered, unordered, etc)

They also do some checking of links to an external document (like a PDF), asking how the user has made it accessible: it's natively accessible, they provide contact information to get an accessible version, or there is an alternate version available somewhere.

Q: So can users ignore all prompts, or is that configurable? What percent compliance with your own rules (IITAA) does this system have? A: The prompting isn't configurable yet. But some options are lockable (like stopping some user classes from editing HTML directly). Most people have been beaten over the head about the rules, so most users don't ignore this. If you have an approver included in the workflow for creating and publishing content, they'll send these problems back to the user before they allow it to be published. Their intranet allows more dismissal/less approval at the final publishing step, so likely less compliance there.

Q: How do you convince your higher ups to contribute code modifications back to the source, and keep a focus on that? A. The reality is that code forks almost never happen, and when they do they are accompanied by fairly grave consequences, basically maintenance upkeep in the long run. If you don't contribute back, each time a major upgrade comes out for the CMS you are using, you have to recode that feature again. Rather than that, it's a better idea to get into a core distribution, because now you've got guaranteed broad community support for the feature you so desperately need that you're willing to develop it on your own.

From the audience: At Yahoo, they've forked their own version of Drupal because they've found they can do significantly better than the development community. Q: What about making Drupal modules (like Views and CCK) more accessible? A: First they wanted to start with Drupal core. Then they'll work with module developers (like for Views) to get them to upgrade their modules to be more accessible. Even though they will often use Views to create a site for a client, they never actually deploy Views for clients to use because it is too overwhelming.

Illinois Web Accessibility Conference and Expo: Session 3 – Adobe PDF Accessibility using Acrobat and Common Look

Speaker: Mike Scott from the Illinois Department of Human Services.

PDF was originally designed for graphic designers, so there was a single file format they could send to printers that had high fidelity between the designer's page layout and the publication the print house produced. The heart of PDF is PostScript, a printer language.

PDF benefits: looks the same on screen, printed, etc. Free viewer. Can be easily transmitted and shared.

PDF behind-the-scenes: PDF documents can contain three different representations of the same document inside the code. The first is the PostScript-based physical view that we see and print. That's what PDF started with. Then they moved on to the content level, used for searching, indexing, copying/pasting. This is the content layer (or content tree), another layer of data separate from the physical view. That layer started to make PDF accessibility possible. But this still wasn't a very rich experience. The layout and formatting (headings, tables, etc.) weren't made available to users of assistive tools. Finally a third layer was added: the tag layer. This exposed document structure and HTML-like markup to the user, including table header markup. It turns the plain stream of text into something semantic and understandable.

PDFs are made either by converting a word processor file, or by scanning and OCRing print materials. Optionally, another step is to take it in to Acrobat Pro and add features that can't be created/converted (interactive forms, for instance).

The heart of PDF accessibility is the tagging structure (document reading order) and tooltips.

The free Acrobat Reader has a light-weight accessibility checking tool, but it only checks for the most basic features, so don't rely on it. Acrobat Professional has a more full-featured [but still not perfect] tool for checking a PDF for accessibility. More importantly, it provides a way to remediate problems on the fly (add alternate text while working on a document) or use the touch up tool to change the reading order of tags/elements on the page. You also often need to access the tag view (in a tree) to expand and collapse branches to see things. For instance, we look at a document and see that it has headings out of order. Then you can just choose the item through the tag tree and change the heading level. Right now the touch up tool doesn't allow you to add anything beyond h3 (although Acrobat understands all the way up to h6). In that case you need to edit the properties of the element instead (not through the touch up tool). You also can't add/manipulate lists through the TURO (touch up reading order) tool. It works very well with simple documents, and it can be used in combination with the tag tree. But there is a catch, related to the reading order.

Because the content and tag layers are separate, they may very well be in different orders. Remediating in the TURO tool changes the content layer, not the tag tree. But screen readers use the tag tree representation. It will update the content eventually if you finish an entire document, but not until then. [I'm not sure I got the gist of this quite right.]

Knowing that you can't trust TURO 100%, there is another way to do a quicker check, in both Reader and Pro. In Reader it's "save as text"; in Pro it's the "save as accessible text" option.

This allows you to very quickly check for the most terrible problems (broken reading order) for a quick triage. One major drawback is this is just a stream of text (no headings or other structural markup is there).

Another neat feature in the free version is read out loud. It's not as good as, nor does it replace, a screen reader for PDFs. But it does read the text in the order of "Save as [accessible] text", so you can use it to listen to a document and identify reading order problems.

Q: Which layer does the save as accessible text come from? Both read out loud and save as accessible text draw their data from the tag tree layer.

All these free tools are useful, but they haven't given us all the tools we need to remediate a document, especially a complex document.

One major issue is that this workflow means that if the source document changes, then you must start over from the beginning.

If you are working with PDF forms in particular (this portion of the presentation doesn't apply to PDF in general), there is a better tool: Adobe LiveCycle Designer. This comes with the Pro level. This is a true form design tool that was built from the ground up (previously as another product, JetForm). It is a GUI WYSIWYG based form creation tool. Lots of the normal form elements (like title/heading) automatically become a tooltip, so nothing extra needs to be added. But there is the ability to manually add a tooltip. Anyone using PDF forms should go to Designer.

Q: Does LiveCycle Designer support JavaScript based form automation/checks? A. Designer does allow the use of script directly, with a rich scripting environment (two languages supported: JavaScript or FormCalc). They haven't used it too much yet, just some basic tasks.

One other way to make it better, especially for documents that aren't forms: we can't create brochures, etc. from Acrobat natively, but we have a tool that makes the testing and fixing a little easier. It's an Acrobat plug-in that does everything TURO should be doing but doesn't: CommonLook (~$1000 retail, or ~$600 on subscription).

Split screen display- the physical view, and then a special version that is the content, displayed in the order as displayed in the tag view, but with some minimal semantic base display changes (headings are bigger, lists, etc).

Q. Why are the words chopped up? Because of the seemingly arbitrary chunks that Adobe breaks text in to. You can go in, grab those chunks, and move them in to the right place. But you can't split the chunk; you have to move the entire bit.

Very simple point, click, select option interface. Much, much better than TURO and the tag tree remediation that comes in Acrobat Pro.

It will also provide great levels of detail about complex objects like tables. It also provides a checklist of possible problems and steps for correcting them in the document. It reports the problem, and when you dismiss the warning it provides the necessary tool/option to fix the problem. You can also check for color contrast (to make sure essential information isn't lost to someone who can't perceive the color).

Jon Gunderson has arranged for a free trial for anyone at UIUC through the IHBE. This is available for about six more months. The only catch is they want feedback on how well it worked for you (to make decisions later on possibly acquiring it for campus wide use). You can sign up for a trial online (as well as on some paper based forms here at the conference). The URL for the site is included in the PowerPoint, which will be posted online.

Keys to Success: Try to do as much accessibility work as possible on the source document before converting to PDF. iCITA offers classes about this (stop by the IT access table and talk to them to find out about the free online versions of their courses). At least use the accessibility check in the free Reader, preferably the better tools in Pro. Use TURO and the tags panel to fix major problems. Finally, if you do this a lot, try out CommonLook.

Q: What if it doesn't need to be a PDF? Do you suggest a different format? Yes, they often suggest that users make a web native format, like HTML. We explain to them that PDF is really about printing, and the move to it in a big way online has to do with laziness (print to PDF style options). But if print output is key (like a form that needs to be filled out and printed) then a PDF version might be the best.

Q. Regarding IITAA: if the PDF is just a second version, and there is an accessible version, is that okay? Yes, IITAA sections 15.1 and 15.2 say either make the document natively accessible, or provide it in an alternative natively accessible format (HTML is usually the best choice). If you just post a link saying "request an accessible copy" instead of posting that version, then no, you wouldn't be meeting the IITAA requirements.

Q: What about linking to the accessible version from the PDF? A. They don't, but it's a good thought.

Re: Today's postings on the UIUC Web Accessibility Expo

I apologize for any grammar and spelling errors for today's postings. Some of you requested this info asap, which means I am just taking as many notes as possible without stopping to check my grammar. I will edit these entries for grammar mistakes later this evening.

Illinois Web Accessibility Conference and Expo: Session 2 – Video Captioning

Speaker: Colleen Cook, ATLAS, UIUC

Their first step is trying to make the task of transcription easier for users. They also are trying to require that users provide a transcript through policy.

They are also trying to use some automation to generate a synchronized transcript, matching speaking pauses to commas, periods, and paragraph changes.

Finally, they are focusing on marketing- educating instructors about this service being available, and what they need to do to make their content more available. They are also advertising how this helps sighted users with search- a transcript/synchronized caption makes text based searching of audio/video presentations far more user friendly.

They also encourage people to keep their audio/video to 5-10 minutes. Most web users just won't sit through a video that is longer than that. Plus, it makes it easier for them to caption and synch it. In fact, they don't charge for captioning audio/video under 10 minutes.

They are currently using InqScribe for transcription, and MacCaption for synching the captions.

They have a web based product that allows the user to upload the transcript and the audio. The software scans for pauses, then does its best to synch the captions with the audio. Then they prompt the user to massage the data.

They create SMIL and SAMI files. They also will work on closed captioning for DVDs. His PowerPoint listed a bunch of software; hopefully they'll post that. If they do I'll come back and link to it.

Their current criteria are that the auto-synching be within a ½ second of the actual audio.
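
As a small illustration of that ½ second criterion, here is a sketch that compares auto-synched caption start times against hand-checked ones and reports which captions fall outside the tolerance. The function name and the idea of having a hand-checked reference list are my assumptions; the talk didn't describe how they measure this.

```python
def out_of_tolerance(auto_times, actual_times, tolerance=0.5):
    """Return indexes of captions whose auto-synched start time is off by more than `tolerance` seconds."""
    return [
        index
        for index, (auto, actual) in enumerate(zip(auto_times, actual_times))
        if abs(auto - actual) > tolerance
    ]
```

Anything this returns would be a candidate for the manual clean-up step.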

They also have the option to auto-segment audio in five second increments. This is useful for more conversational speaking, Q&A, anything that isn't more like a "read"/prepared presentation. This also works for people without transcripts, as they can listen to each five second segment, and then type in the caption directly. They are working on increasing/decreasing segment length, as well as removing or inserting segments. They want to have the segments hit the next "probable" speech pause instead, so the arbitrary time based cut-off doesn't clip the middle of a sentence or word. Right now the IITAA doesn't require this for course content, only publicly available content. Right now, classroom content is only required to be accessible if a member of the class needs and requests it.

They have this working for both Flash and QuickTime. They also want to incorporate additional audio data. Contact presenter at dpharvey@eiu.edu
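
Here is a rough sketch of the "cut at the next probable pause" idea: aim for five second segments, but if a detected pause falls shortly after the nominal cut point, cut there instead. The pause list is assumed to come from some separate silence detector, and the `max_overrun` window is a parameter I made up; none of this is the presenters' actual algorithm.

```python
from bisect import bisect_left


def segment_boundaries(duration, pauses, target=5.0, max_overrun=2.0):
    """Return cut times (seconds) that prefer detected pauses over fixed intervals.

    duration    -- total length of the audio in seconds
    pauses      -- times of detected speech pauses (assumed input)
    target      -- nominal segment length
    max_overrun -- how far past the nominal cut we will wait for a pause (assumption)
    """
    pauses = sorted(pauses)
    boundaries = []
    start = 0.0
    while start + target < duration:
        nominal = start + target
        i = bisect_left(pauses, nominal)  # first pause at or after the nominal cut
        if i < len(pauses) and pauses[i] <= nominal + max_overrun and pauses[i] < duration:
            cut = pauses[i]
        else:
            cut = nominal  # no nearby pause: fall back to the fixed interval
        boundaries.append(cut)
        start = cut
    return boundaries
```

With pauses detected at 6.0 and 11.5 seconds in a 15 second clip, the cuts land on those pauses rather than at 5 and 10 seconds; with no pauses at all it degrades to plain fixed-length segments.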

Slides will be available online at http://www.eiu.edu/~cats/iwaac/

Q: Can anyone access/use this now? A: We hope so soon. They are looking for the best ways to distribute this, maybe a web app people log in to and use. But maybe they'll go the downloadable app route.

Q: Data on cost per minute to do the synchronized captioning? A: Right now their tool is quicker than real time (working with compiled C code to be faster than, say, PHP). Definitely cheaper than any professional service, and they hope to provide this service for free. Most tools require you to sit through and do this manually; even with a transcript, it takes at least (and usually more than) real time. A 20 minute clip takes >= 20 minutes to synch. Only 1 in 10 times does it require remediation on the auto-synching.

Q: Anything there to [couldn't hear this question] A. No, this is geared primarily for speech, not video.

Q: Can it accept a file, or do you need to type it in? You can just upload it with the media file.

Speaker: Angie Anderson, Accessible Media Service, DRES, UIUC

They've only been doing this for about 2 years now, and they are focusing on IITAA compliance. They are trying to educate the faculty about the need to caption. They have run in to a few instructors who are very resistant to captioning video, and they've been using need cases (a disabled user in the class) to show instructors why there is a need for this [just beyond search improvement?]. They currently have one academic hourly doing most of the captioning. She is working 40 hours/week and is constantly busy. They are currently working with the Office of Public Affairs to come up with a campus wide policy on captioning, which will help make faculty aware of what their responsibilities are. But some of them don't even know what markup format/accessibility improvements will work in their smart classrooms (just how to turn the caption option on for a DVD).

It's important to keep data on how long it takes, so you can get more money from your director (which you will need later). They also use a lot of student workers. Make friends with people on campus that create a lot of video, like Colleen or Liam from ATLAS. Professors are going to provide videos in lots of different formats, created with a myriad of tools, some of which are very odd and hard to deal with. They ran in to a few formats that the captioning software couldn't use.

They use Windows Movie Maker (standard on Windows computers). They always add a "captioned by" plug at the beginning or end of the video so people know who's doing it on campus. They also use Express Scribe with WordPerfect. It can extract audio from a video for you, and use a foot pedal to stop/start the playback while transcribing. Their main software for captioning is from CPC. It's high end and expensive. Their version is $5000. But they've paid for that many times over in the last two years. The software is great, and the support has been outstanding, especially early on when they were just trying to figure out how to do video captioning (more a user issue).

The only problem is that it can't import Flash videos. They mostly use AVI or Windows Media files. They try to stick to 25-30 characters per caption line. They also try to use markup stamps (music icon) and speaker change indication as well. The most prevalent format right now on campus is Flash. Right now they export the data to XML, then use Adobe (currently CS3, soon CS4) to stick the captions in to the Flash files and synch them.
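
The 25-30 characters per caption line guideline mentioned above is easy to sketch with Python's standard `textwrap` module. This is only an illustration of the line-length rule, not how the CPC software does it; the function name and the choice of 30 as the hard limit are mine.

```python
import textwrap


def caption_lines(transcript, max_chars=30):
    """Split transcript text into caption lines of at most `max_chars` characters."""
    normalized = " ".join(transcript.split())  # collapse runs of whitespace
    # break_on_hyphens=False keeps things like "25-30" on a single line.
    return textwrap.wrap(normalized, width=max_chars, break_on_hyphens=False)
```

For example, `caption_lines("They try to stick to 25-30 characters per caption line.")` gives two lines, each under the limit. Real captioning would also need to respect sentence boundaries and timing, which this sketch ignores.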

The national standard for captioning is about 6-8 hours per hour of audio/video. But with the transcript provided, the turnaround time is closer to an hour (or close to the real time of the actual video). Most of the time it takes a day per video.

About 90% of their time is spent creating the transcript. So they use lots of students. They do community service hours work sometimes to get free transcription as well.

Q: What are other tools you've tried and maybe haven't decided to use? What about Dragon NaturallySpeaking for instance? Something for people that are only doing captioning sporadically? A. A lot of people on campus have tried this and not been happy with it.

Q: Data on cost per hour? Yes, we keep it, but don't have it on hand. Last semester they captioned about 400 videos, some 3 minute clips, some hour long ones.

Q: Moving from captions to subtitling? A. Not yet, but considering it. Many professors don't like to see always on screen subtitles.

Q: Dragon for parroting? They haven't tried it, but they feel that they can type faster than they can talk.

Speaker: Liam from ATLAS

Normally, if they have a video that needs captioning, then they go to Angie at DRES. If it's too complex, then they jump back in. They often require users to contract with a third party to create the transcript. Then they work from that. Their tool [missed name, but locally developed, so it may not have one; they also use Encore] takes the (usually garbage) text file, and dumps out segments of the appropriate character length for captions. Then it prompts the user to correct the mistakes. This then dumps out a GFXP.xml file with time codes. Then a third program allows you to synch up those captions with segments of audio/video.

They are working hard at making a player to show the multimedia with captions. They want captions and scene description [descriptive audio]. There are currently two good players available; one is made by WGBH and one is by Ohio State. They are both based on the GW player (?), an open source product.

For the scene description, the best practice is currently to have a second mp3 file playing (to describe the content of a PowerPoint slide, for instance). But they'd prefer to integrate it into a single file, or at least make the player that can play/control them separately (and, for instance, pause the main AV when the descriptive track has content to play).

Prototype at http://flash.atlas.illinois.edu/Prototype.html . Liam couldn't show it because the network blocks the port it needs.

Illinois Web Accessibility Conference and Expo: Session 1 – Keynote Speaker Tsaran, Senior Yahoo Accessibility Manager

Why does a conversation about making content accessible to all always end up talking about people with disabilities? For most people, technology use is a choice (grab a book, hop in the car, etc). For people with disabilities, technology is not a choice, but essential for everything, even up to cooking (microwave oven with tactile labels, audio alerts to know something is charging/finished charging). Accessibility makes life easier for everyone, but for some people it doesn't just make life easier, it makes it possible.

He speaks at every new hire presentation. First they see the Benefits people (work hard, get shares, they're excited about getting rich). They show lots of videos, and then give them a few tests. One is to have them write down a bunch of numbers (huge random numbers in the millions), then they describe what each of those is (number of people who are blind, how much disabled users spend each year, etc.), and this gets their attention. They do this to make sure the concept of accessibility is in their minds from the first day.

They always ask right away how many disabled people use Yahoo. Then he has to explain to them why this is a bad question: if you don't design accessible products to begin with, you won't attract any of those users anyway. And besides, we don't ask how many customers are Chinese, but we make a Chinese language site. And you'll never be able to collect this information because browsers don't indicate if users are employing screen readers or other assistive technologies.

They complain that making accessible stuff is hard! And he admits it can be, but finds that most of the time the majority of accessibility issues are very easy to fix. He wishes there were more advanced problems to deal with, because it would be more challenging. They spend lots of time with simple things ("where is the alt text for your images")- questions and problems that we should be beyond by now.

Myth 1: Accessible design means you can't have a very visual based layout/presentation. No. If you design correctly from the start (lay out content first and then put visual design on top), then you can do almost anything you want with the visual part.

Myth 2: I have to meet every accessibility guideline out there (a testing tool that spits out 300 possible errors/problems per page). Guidelines are just there to help keep you on the right track. What we are aiming for is making web sites usable for everyone. If you only comply with 1 guideline, and your site is functionally accessible, then you're good.

Myth 3: I have no idea how assistive technologies like screen readers work, so I can't tell what is a good, useful accessible design, even when I try to use a screen reader. If something (like a screen reader) doesn't feel native to you, then it is hard to test what you've built. Developers actually don't test correctly: they still rely on things on screen that must be seen, and they use the mouse (so maybe for that development you remove the mouse and turn off the screen). You must involve the actual users of the technologies you've been building for, like a screen reader. [This is the same as getting non-assistive technology users to test a site, and Nielsen's rules of usability testing on the cheap would apply here too.]

Finally, just a quick note: if dealing with accessibility becomes too mundane, you're not going to enjoy whatever it is you're doing. So find the part of accessibility that interests you (designing pictures, HTML markup, dynamic content, testing with users, etc.) and work on that. That might mean using testing tools on your products, interacting with real assistive technology users, etc.

[Jon interrupted to ask HIM to move along to demo, because we were running out of time]

With Web 2.0, we are pushing the boundaries of the user's online experience, and that has made a lot of screen reader users very unhappy (because these new tools haven't been designed to be accessible).

He is using (and Yahoo uses it a lot) NVDA, an open source screen reader for Windows. The reason he isn't using JAWS is that NVDA is free and open source (you can download and play with it) and it tends to be the most precise of all screen readers: it's not forgiving. NVDA won't guess anything, unlike JAWS and other commercial products, which will. That guessing leads developers to believe that they've made a universally accessible site, when really they've just made a site that works with JAWS.

Demo of the current Yahoo home page. Shows the new "My Favorites" module too, which allows an [iGoogle-like] portal experience. First, how do we make this accessible for keyboard-only users, then for screen reader users? First, notify users that when they type, something has come up on the screen. For instance, announce that search suggestions are available in the auto-suggest drop-down box that a sighted user could just see. [Although I think after the first X number of characters it should stop telling the user.] They did this with ARIA-compliant markup for the live region (the input box), plus an audible alert that first tells them how to return to the search box (press the escape key to return to search, up/down to go through suggestions, right arrow to move to related materials). Every time they move between panels, there is an audible alert telling them which region they are in, and instructions on how to move back to their previous area. There are also a ton of audible alerts explaining how to move to different regions (first results, etc.) after a search has returned results.

And their "intelligent searches" extend to after the search (search for this term in Wikipedia), and those links also give audible results. Those search refinements have keyboard shortcuts assigned, so users can quickly re-execute searches in specific, highly used domains.

He talked about the hover feature on the Yahoo Mail option. That hover option doesn't work with screen readers, so they introduced additional links so you can expand those preview panels from the keyboard. He demoed a link to All Things Digital. He hits enter and the content area (search results area) fills with content from All Things Digital (well, in this case the demo failed, but it still had content: "sorry, this isn't working right now"). When they click that link, focus is also moved to the beginning of that content area. When the user moves out of that preview area, they are returned to the full list of preview options. Their Yahoo preview actually makes a more accessible version of the Facebook content than Facebook itself (because they scrape that content and then present it in their own data wrappers/templates). They give the users a message about the content loading, or not being loaded, using ARIA live regions. They do this with a "close link" option, attached to the red X close window button/link. [I wonder why they didn't announce a keyboard shortcut for going back to the list as you enter the dynamic content region, rather than having to use so many tabs to click. For example, ctrl and the up or down arrow to move through the My Favorites links, or just ctrl+shift+alt+enter to return to your previous position in the My Favorites list.]

Q: Are there preferences for screen reader users to control how much audible feedback they get through Yahoo? A. No. They don't have anything built in to their site for preferences on what is being read or not; they leave that up to the screen reader preferences.

Q: What toolkits are you using to develop this, or are you developing your own? A. This design is using YUI 3. See yuiblog.com for more information on this. Check developer.yahoo.com as well for Yahoo-specific development. This new home page only went live a week ago, so they haven't blogged much about it yet (it was in development).

Q. How do you convey the shortcut keys to regular users? A. All via the help documentation, not anywhere on the page (status bar, etc.). Yahoo feels every pixel counts, since they're an ad-driven company, so space for that type of on-screen prompting isn't available.

Q. Do you foresee a time when something very graphical, like Yahoo Pipes, can be made accessible? A. Yes, eventually, absolutely. Pipes is only in maintenance mode now, which is why it isn't. But if you apply ARIA and good usable design, it's possible. For instance, the latest Firebug, which is very complex and visually oriented, is 100% accessible now.

Q: Regarding advertisements: since they are revenue based, do they have to do something to make the ads accessible? A. That's an ongoing effort of his, getting them in that mindset. Right now it depends on the site. For the Yahoo front page, more so, but not 100%. He currently is trying to get at least alt content for ads, describing the content beyond "flash ad here." Right now the problem is that the third-party developers aren't providing enough metadata for Yahoo to intercept and remediate. For now they mark up iframes as ads, allowing screen reader users to skip them. But they really can't turn away a one million dollar ad from a company because it's not accessible.

Q: When adding instructions for navigation, in going through this process, did the navigation get improved overall (for non-screen reader users)? A. Yes, they found that users really appreciated that they were guiding them. [I think Nick was asking about non-assistive technology users, and the speaker answered about testing with screen reader users.]

Q: Suggestions for how sighted users can experiment with screen reading software? A. Start by browsing the web for some excellent screen reader videos. For sighted people, nothing is as powerful as watching a video of an actual user trying to use their product with a screen reader. He's made one video for that online as well (shameless plug, so search for his). Download NVDA and play with it. Finally, get together with an actual screen reader user and take some notes. You'd be surprised how much you can learn in a short period of time.

Thursday, September 17, 2009

First thoughts on: Nicholas et al, Student digital information-seeking behavior in context

Journal of Documentation 66.1 (2009): p106-132. doi: 10.1108/00220410910926149 (University of Illinois Access)

Why should you read this: There's plenty of data in here to compare student versus staff/faculty aggregate use of e-journals and e-books. I didn't find the detailed level of transactional analysis I was hoping for (examining individual users' paths through the library web site and these resources, and back), but I'm not exactly disappointed by this. That level of sophisticated user behavior monitoring requires an extraordinary amount of information collecting at the point of use (often requiring tracking users using cookies, and applying this tracking across multiple domains, many out of the library's control), not to mention exhaustive vetting by IRBs and full disclosure to potential participants. This level of study is something I hope to see being generated in the near future, and I wouldn't be at all shocked to see it come from Nicholas et al. :)

In brief: Deep log analysis of a community of users (students, teachers, researchers) from the four-year-long Virtual Scholar program, consisting of more than three million transactions. This study analyzes users' actual interactions with two e-journal collections (Synergy from Blackwell, 700 journals, and OhioLINK, 6000 journals) and one e-book collection (Oxford Scholarship Online, 1200 titles). The researchers contrast their own findings with previous research that has generally relied on self-selected/self-reported research methods (and less in-depth transactional log analysis). Also includes a very good lit review. "The university studied [regarding Synergy] has more than 2,500 full-time faculty members, slightly less than 3,000 full-time members of staff and about 9,800 students of which about 4,000 are undergraduates." p. 119. University College London was studied with regard to Oxford Scholarship Online. University College London "has more than 4,000 academic and research staff and about 19,000 students of which more than a third are at graduate level." p. 125

Interesting quotes and my thoughts:

Highlights from their excellent review of the literature:

"students… were more likely to undertake longer online sessions." p. 106

"The literature shows that undergraduate students opt for the easiest and most convenient method of information seeking (Valentine, 1993), and appreciate the time saving characteristics of electronic resources (Dalgleish and Hall, 2000). Students are said to rely heavily on simple search engines, such as Google to find what they want. (Dalgleish and Hall, 2000; Becker, 2003; Drabenstott, 2003)." p. 108

" The young web users tended to examine briefly the first few hits on the initial results pages before performing new searches, rather than examining every hit in detail." p. 108-9

" Prabha et al. … showed that undergraduate and graduate students tend to stop looking for information when they find the required number of sources for an assignment." P. 109

And highlights from their research/findings

"In terms of the type of page viewed, surprisingly perhaps, undergraduates proved to be the biggest viewers of abstracts… The use of PDFs increased as users moved up the academic scale… Perhaps undergraduates were much more interested in cutting and pasting, something much easier to do in HTML format?" p. 114

My own experience with undergraduates makes me think this might be because undergraduates sometimes consider an abstract "enough" information (satisficing). In a study I'm currently conducting, several undergraduates explained to me that they often just looked for good sources (meaning sources their instructors had indicated were acceptable, not necessarily any analysis they were doing to determine quality for themselves) to cite. They went on to explain that they had, in the past, simply attributed some statement in a written assignment for a class to a likely article based simply on reading the abstract. They indicated that they did this because they were confident that their instructor would not perform a detailed enough analysis of their paper to discover that they had not actually accessed, let alone read, the materials they were citing (and were more confident this would be the case if an item wasn't easily accessible online).

" Most Synergy sessions did not feature the use of the internal search facility but of those that did undergraduates, as might have been expected, were the most likely to use it… Undergraduates undertook the greatest number of searches, 10 per cent of all sessions saw more than ten searches being conducted. What is not clear is whether this constituted effective searching or not" p.115-116

"While nearly two-thirds (65 per cent) of staff searches used advance search… this was true of just over half (54 per cent) of searches conducted by students. Students were more likely to undertake a search using only a simple search." p. 124

"A third of UCL usage related to the student halls of residence network, which considering that the Oxford books were monographs (thought to be more suitable for staff) rather than text books, showed a strong interest in e-books amongst students, something which conflicts with the findings of others (Anuradha and Usha, 2006)." P.125-6

I don't see this finding at all in conflict with the findings of Anuradha and Usha. To make such a claim would require comparing this particular population's use of e-books against their use of print books, especially given the extremely small size of the e-book collection (1200 titles) compared to the UCL print collection of nearly one million volumes. In fact, I now feel the need to go dig up my old notes (from my pre blogging days :) on Anuradha and Usha's work. Since those are already written, they'll get posted before this review. There, done. See my post First thoughts on: Anuradha and Usha, Use of e-books in an academic and research environment: A case study from the Indian Institute of Science. The only fact I can draw from this data is that students seem to use more e-books than staff do (at least students in residence halls and staff in their offices- who knows about staff use from home or off-site office/coffee shops, etc) and not that students, generally, use e-books in any substantial sense when compared to their use of print books (or other electronic information sources, like free web sites, electronic journals, etc)

"Finally, students were more likely to find OSO titles via the Library catalogue… [than]… staff." p. 126

This is more evidence that, for undergraduate students at least, it is vitally important to make sure that access to e-books is provided via the catalog, or via whatever tool is the default/go-to tool for students when looking for library-provided information. This could just as easily be a meta/federated search tool that incorporates both the library's print and electronic bibliographic information, e-journal collections, and librarian-generated electronic content, rather than a traditional catalog, so long as it is the primary tool for locating information, something most likely facilitated by making it the most prominent search (or only search) on the library's web site.

"Thus the usage profile [for the use of e-journals] of undergraduates is that they conduct many sessions but do not view a lot of pages during a session… This all fits the picture of students as "bouncers" established by the authors (Nicholas et al., 2007). However, this turned out not to be the case with e-books where students viewed more pages in a session than staff. This could be because e-books are a more appropriate form of e-resource to students, which seems logical." p. 128

"In regard to full-text viewing, the Synergy study showed that the use of PDFs increased as users moved up the academic scale, from undergraduate to professor/teacher." P. 128

"The OhioLINK study showed that students were more likely to record long online sessions lasting more than 15 minutes… Students were much more likely to read online than other academic groups and this was partly to do with personal preferences and partly to do with the print charges students are faced with in many institutions… a survey conducted by Outsell … found that undergraduates were more willing to rely on electronic resources than graduates and faculty, with approximately half using electronic resources exclusively or almost exclusively. A survey carried out at the University of Strathclyde… found that the majority of [student]users (94 per cent) read them on-screen." p. 129

" Overall… e-resource use… is on the increase and there is a reliance on simple searching, and students get better at searching as their skills as they progress to the higher stages of their studies." p.129

"It would be a mistake to believe that it is only students' information seeking that has been fundamentally shaped by huge digital choice…Virtual Scholar research has shown that a considerable number of users exhibit a bouncing/flicking behaviour, which sees searching conducted horizontally, rather than vertically. Power browsing and viewing appear to be the norm for many; reading appears to be undertaken only occasionally online, probably undertaken offline and in some cases not done at all." p.129-30

First thoughts on: Anuradha and Usha, Use of e-books in an academic and research environment: A case study from the Indian Institute of Science.

Program 40.1 (2006): p 48-62. doi: 10.1108/00330330610646807 (University of Illinois Access)

A quick note: I'm sorry that this doesn't quite follow my normal review style. I actually dredged this up from notes I took over two years ago on this article. I wanted to post them, though, because an article I'm reviewing right this moment (and am about to post a review for :) mentions this article and indicates that their own research is at odds with Anuradha and Usha's- an assertion I don't agree with. Rather than trying to just take a few excerpts from these notes and post them in the other review, I decided to just go ahead and post all my notes for this article. By the way, if you ever are wondering if I have notes on a publication regarding e-books (in particular academic e-books) on hand- well, I probably do. Just fire off an email or blog comment and I'll dig through my older notes for you.

Why should you read this: This is definitely a must-read for anyone interested in e-book use in academic libraries.

In Brief: Analysis of a 27-question e-mailed survey, conducted in 2004, of the Indian Institute of Science's 2,200+ active researchers (students and faculty), to which 101 responded. The survey asked about their use of and feelings about e-books in general, and about several library trials from the publishers/vendors ebrary, Kluwer and Engineering Village. Starts with a good introduction to how e-books have been defined in the past (p. 50-51).

Key Quotes & My Thoughts:

Anuradha and Usha note, in reporting on their 27-question e-mailed survey of the Indian Institute of Science's 2,200+ active researchers about their use of ebrary, Kluwer and Engineering Village, that "like all internet-based resource, e-books break down geographic barriers" (Anuradha and Usha, 49). That is true, insomuch as two people at two different locations can access an e-book equally well, provided, of course, that they both can access the internet where they are. If they cannot, then it is unlikely they will be able to use the e-book as easily as a traditional paper book. Although getting initial access to the book is enhanced, in cases where e-books can't be used offline, the print book, once obtained from the library, provides more consistently reliable continual access than most e-books, due to the unnecessary restrictions most e-book providers place on the ability to save or print e-book contents for offline viewing in places without internet access. Anuradha and Usha summarize their own research and list the many real and potential advantages e-books have offered for several years now to both librarians and patrons, but immediately acknowledge that "In spite of these advantages, e-books are still not very popular." Among the possible reasons they list for this are "limited availability of titles… difficulty in accessing computers or the internet… problems with printing and downloading" (Anuradha and Usha, 51).

Of the 101 respondents, 60 had used e-books. Of these 60, 52 indicated that they would want to use/read e-books from the Library (54). This is not surprising, given that computer science, technology and other STM electronic publications generate more use than other subjects in most of the literature. What is surprising is that although 87 percent of the respondents who had used e-books at some point wanted to use e-books from the library, of the 55 respondents who answered the question "Overall how satisfied are you with eBooks?" none said they were extremely satisfied, 37% said they were very satisfied, 55% stated they were only somewhat satisfied, and 8% were unsatisfied (Anuradha and Usha, 55). That means that although on the one hand this study indicates that 87% of the respondents who have used e-books want to use library e-books, fully 63% of those who rated their satisfaction were at best only somewhat satisfied with the e-books they have used. This, despite the fact that 70% would definitely recommend, and another 17% would probably recommend, e-books to others. These seemingly incongruous findings might be accounted for when you consider the reasons they commonly gave for using e-books, and the features of e-books that impressed respondents. The clear consensus of useful features was: search tools to locate words or quotes (72%) and instant access to content (63%), followed by mobility (50%) (Anuradha and Usha, 56). So there are clearly features that are unique to the e-book, as a location-independent means of access, compared to the print book. However, mobility is a more difficult concept to define. Mobility as it pertains to e-books is usually tied to the DRM methods employed to protect the publisher's content rights management and distribution interests. For all of the publishers listed in this study, there is no possibility of downloading e-books to a device that could later be used at a location that does not have access to the internet.
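The percentages in the paragraph above are easy to lose track of, so here is a quick sanity check as a minimal Python sketch. The variable names are my own, and the inputs are only the figures quoted above (60 e-book users, 52 wanting library e-books, and the 0/37/55/8 satisfaction split); nothing here comes from the article's raw data.

```python
# Figures quoted in the paragraph above (Anuradha and Usha, p. 54-55).
used_ebooks = 60            # respondents who had used e-books
want_library_ebooks = 52    # of those, respondents wanting library e-books

# 52 of 60 rounds to the 87 percent quoted above.
share_wanting_library = round(100 * want_library_ebooks / used_ebooks)

# Reported satisfaction split (percent of the 55 who answered the question).
satisfaction = {"extremely": 0, "very": 37, "somewhat": 55, "unsatisfied": 8}

# "At best only somewhat satisfied" = somewhat satisfied + unsatisfied.
at_best_somewhat = satisfaction["somewhat"] + satisfaction["unsatisfied"]

print(share_wanting_library, at_best_somewhat)
```

So the 63% figure is the sum of the "somewhat satisfied" and "unsatisfied" groups, not a separately reported number.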

Among the respondents' reasons for not using e-books, at the top of the list were the facts that they were "hard to read/browse" at 22%, "lack of familiarity with products" at 19%, "used to reading print books and no wish to change" at 18% and cost at 17%. Another interesting reason, considering that all of these researchers reported high access to and use of computers (with 100 having used a computer for over a year, and 80 percent having used computers for more than five years), is that 11 percent reported difficulty in accessing computers/internet. This would seem to indicate that although they can access a computer and the internet periodically, access is not ubiquitous enough for them to always easily reach the e-books in the format and under the conditions provided by the publishers investigated in this survey… all publishers who subscribe to the "online or internet based" model as opposed to the "offline" one (50).

"The main features of e-books that were disliked were the incompatibility between different suppliers, lack of user friendliness of interfaces, the problems associated with usernames and passwords, and the variety of devices available in the market" (58).

"Thus, by carrying out user education, publicity, raising awareness about the software/hardware used for e-books, increasing bandwidth and making e-book reader devices available along with e-books through the libraries, the use of e-books can be increased" (59).

Web Accessibility for Online Learning

Web Accessibility for Online Learning, written by Hadi Rangin, Web Design and Accessibility Specialist for the Illinois Center for Information Technology and Web Accessibility, Disability Resources and Educational Services (DRES) (University of Illinois at Urbana-Champaign) is a must read for anyone creating online course materials. You can meet Hadi in person at the upcoming Illinois Web Accessibility Conference and Expo on September 29th at the Alice Campbell Alumni Center in Urbana, Illinois.

Wednesday, September 9, 2009

New Apple iPod products out, but where's my 32/64 gig nano?

Ah Apple, how you wound me. So often you give me what I need (let's see, I own, or have owned, six- maybe seven - iPod variants over the years). And what I need now is a small, flash based MP3 player. I just love my 16 gig nano, but it's out of room. I had waited with bated breath for today's iPod rollout, expecting to see a new 32 gig (and maybe even a 64 gig) nano. And instead you just cram new features into the same old shell. A (video) camera, a larger screen, fm support (and recording) are all neat, but you could have just left those areas for the Touch (which is the bells and whistles model anyway). What I need is a single purpose device (mp3 player) that is super small, with enough flash based storage for all my content and a reasonable battery life. 32 gigs will do for my music, but now that I'm using Audible (after they finally upped their audio quality to the "enhanced" – which for me means tolerable – audio format) I need 64. I'll keep an eye out for the rest of the day. Maybe more models are yet to be announced? Hope, hope, hope…

Upcoming Event at the University of Illinois: Illinois Web Accessibility Conference and Expo

The Illinois Web Accessibility Conference and Expo is an all-day event on Tuesday, September 29th at the Alice Campbell Alumni Center at the University of Illinois. $50 early registration, $75 after Sept. 22. (Metered parking is available at $0.75/hour, or get an all-day parking pass for $9.00 from the Campus Parking Department.)

Keynote Speaker: Victor Tsaran, Senior Accessibility Program Manager at Yahoo.

This will be the best bang for the buck as far as accessibility conferences go (the price includes a boxed lunch)! And, if you're interested, I'll be part of one of the panel discussions (the Content Management Systems II: Illinois Efforts panel), so stop by and heckle me. :)

Thursday, September 3, 2009

First thoughts on: Soules, The shifting landscape of e-books

New Library World 110.1/2 (2009): p7-21. doi: 10.1108/03074800910928559 (University of Illinois Access)

Why should you read this: This is a pretty good summary and comparison of the most salient and striking findings of the ebrary surveys. I also enjoyed the author's own insights gained from on-the-ground interactions with users, librarians, and library staff. If you haven't read the ebrary surveys yourself (and don't feel like wading through them) this article serves as a pretty good surrogate. It's also well written and a generally pleasurable read.

In Brief: Review of some findings in the 2007-2008 ebrary e-book survey of libraries, librarians, and users about their perceptions and use of e-books, with added information provided from the author's own experiences.

My thoughts:

I understand the author's desire to see the teaching of the advanced features of e-books that are not supported (or are at least certainly frowned upon :) in the print world, like annotating and highlighting text, or the automatic creation of citations, or any cool, useful feature that would, once taught, drive students to use e-books. However, at this point in the evolution and adoption of the e-book, that just doesn't seem practical. Every vendor/publisher platform offers different variations of "advanced features," and e-books as content are very transitory at this point (see the author's mention of a chapter of one e-book being revoked, or e-books moving between platforms from year to year). On the other hand, if we do come to a point where there is a standard file format for e-books that users can download and permanently archive (similar to a PDF, but based on something more flexible and non-proprietary, like the open .epub standard as a delivery format and not, as it seems more likely to become, an industry storage/exchange format that gets converted to a proprietary format for delivery to end users) then at that point we can start to teach advanced ways to interact with e-books (like highlighting and annotating). Once those features can easily be overlaid onto e-books, independent of the platform they were hosted on/delivered through, in a completely publisher/vendor agnostic way (think of the various open standards that dictate how web content is authored, and how any number of web browsers and other applications can easily manipulate this content in any way), then, at that point, we should rush out and start teaching users how these advanced approaches can make their digital consumption easier and richer (at the same time :) than print consumption.
Once we get to that point (_sigh_, I mean if), librarians and instructors can begin to offer lessons on the advanced manipulation of texts, knowing that the tool they are teaching (say, the Zotero of e-book markup and manipulation) will remain available to their students for use indefinitely (okay, I know that's never going to happen, we're talking about file formats and electronic devices after all) or at least for a reasonably long period of time.

Interesting quote: "Fundamentally important is the need to broaden the concept of an e-book" p.10

Interesting quote: "Ability for more than one student to use an e-book at the same time… When students reach the [license based access] limit, they do not know why… They do not connect e-books with the concept of circulation. They are used to entering databases without a user limit." P. 13

Interesting quote: "The students… basically want to cut and paste into their research papers, print for easier reading or reading when online is not practical (on public transit, for example), or download to use later… One [barrier to this use] is clearly copyright and publisher concerns." p.13

Interesting quote: "There are a couple of other drivers that will influence how soon students become fully used to e-books. One of these is the textbook market." p.14

Interesting quote: "Thus, the tide appears to be shifting generally in science, humanities, and social sciences, with e-books undergoing a slow evolution rather than a dramatic revolution." p.15

Interesting quote: "Faculty does think that there are too many technical restrictions on e-books, citing printing, number of users, etc." p.16

Interesting quote: "e-books currently tend to be more expensive than print books." P.17

Interesting quote: "Despite the issues and despite slower-than-expected evolution, e-books prevail" p.19

Warning: nitpicking and good-natured ribbing follows. I have a slight issue with the word choice here. I don't feel that e-books, at this point, have really lived up to the hype to the point where I'd see them as deserving of the term prevail, as in "effectual or efficacious; successful." If we consider the generally obsolete use of the term, "to become very strong; to gain vigour or force; to increase in strength," well, that I'll give the academic e-book market. They have been gaining force/popularity/prominence, just not at the rate we'd like (and keep getting promised).

I do concur with the author's conclusion that the ascendancy of the e-book in general (and academic use/prominence in academic libraries in particular) is happening now, and will displace print books eventually, although the exact format of and method of delivery/access may change dramatically before we get to that point. The author, wisely (unlike many previous e-book proselytizers), is not lured into making predictions on just when this might happen.

First thoughts on: Abdullah & Gibb, Students' attitudes towards e-books in a Scottish higher education institution: part 3

Library Review 58.1 (2009): p17-27. doi: 10.1108/00242530910928906 (University of Illinois Access)

Why should you read this: Well written and concise. I personally found it very thought provoking (although I'm betting there are many, many articles and blog posts out there that suggest largely the same things I do…). Although this article generated some interesting ideas for me (see My Initial Thoughts), I found the study too limited to be of practical, generalized applicability (see In Brief) when considering the performance of TOC versus BOB indexes versus full-text searching.

In Brief: 45 users (Masters students in the department of Computer and Information Science) were studied as they attempted to perform tasks related to locating relevant content (quick look-up reference style consultation, not in-depth reading, related to quick factual items in the books, or drawing conclusions through analysis of small portions of the e-book) in PDF format e-books (non-fiction, Computer Science, with TOCs and indexes), contrasting the effectiveness (defined by the authors as how efficiently/fast the task was completed, how effectively/successfully the task was performed, and how useful the user perceived each feature to be) of full-text searching (FTS), the Table of Contents, and the Index to locate the necessary information. The study finds that the use of the Index was more efficient (faster) than using the TOC or FTS, but not necessarily more effective (correct answer located) than using either the Table of Contents or full-text searching.

My initial thoughts: Limiting the type of e-book studied to PDFs was probably overly restrictive, and the biggest weakness of this study. It makes this more an evaluation of Adobe Acrobat Reader's search within PDF documents versus more traditional (print media) tools for finding information (TOC and index) than an actual evaluation of the comparative usefulness of full-text searching over the use of TOCs and indexes. In other e-book platforms, full-text searching is not necessarily so brute force (find this word/phrase, then keep clicking through each occurrence, in a linear fashion, until you find the most relevant section) and often does (or could) employ more advanced relevancy ranking techniques (even those based on additional text mark-up practices) than the search functionality in PDF documents. Thus, I don't find the authors' evaluation that indexes are generally more efficient for finding information than full-text searching to be very persuasive. Even the PDF search tool could (if Adobe chose to do so) be vastly improved to offer a relevancy-ranked set of results for full-text searches, rather than just marching linearly through all matching keywords in the document (so long as additional metadata was available for that particular PDF document). The very back-of-book indexes the authors find so useful could easily, in an electronic format, be used to weight full-text keyword searches, using the existing data already provided in the BOB index and TOC. In that scenario, if a user searches for a keyword as it appears in the index or TOC, or especially in both, then the search results would offer the pages referenced in the TOC and index for those matching keywords first, and only offer the rest of the linear results second, marrying the convenience of automated full-text searching with the added value of human-created indexes.
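To make the index-weighted search idea above a little more concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `Book` structure, the `rank_pages` function, and the boost values are my own illustrative assumptions, not anything Adobe, the study's authors, or any e-book platform actually implements. It simply counts keyword hits per page and bumps up pages that the back-of-book index or the TOC already points to for that keyword.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class Book:
    pages: dict[int, str]        # page number -> page text
    index: dict[str, set[int]]   # back-of-book index term -> pages it cites
    toc: dict[str, int]          # section title -> starting page


def rank_pages(book: Book, keyword: str,
               index_boost: float = 5.0, toc_boost: float = 3.0) -> list[int]:
    """Return matching pages, best first (boost values are arbitrary guesses)."""
    kw = keyword.lower()
    # Pages a human indexer or the TOC already associated with the keyword.
    index_pages = {p for term, ps in book.index.items()
                   if kw in term.lower() for p in ps}
    toc_pages = {p for title, p in book.toc.items() if kw in title.lower()}

    scores: dict[int, float] = {}
    for page, text in book.pages.items():
        hits = text.lower().count(kw)
        if hits == 0 and page not in index_pages and page not in toc_pages:
            continue  # page is unrelated to the keyword
        score = float(hits)
        if page in index_pages:
            score += index_boost
        if page in toc_pages:
            score += toc_boost
        scores[page] = score
    # Highest score first; ties fall back to plain page order.
    return sorted(scores, key=lambda p: (-scores[p], p))


book = Book(
    pages={
        1: "An introduction to sorting and searching.",
        12: "Quicksort is the sorting method covered here.",
        40: "Sorting, sorting, sorting: a digression on sorting socks.",
        41: "Unrelated text.",
    },
    index={"sorting": {12}},
    toc={"Chapter 2: Sorting": 12},
)
ranked = rank_pages(book, "sorting")
print(ranked)
```

In this toy book a linear PDF-style search would walk pages 1, 12, 40 in order, and a pure hit count would put page 40 first. With the index and TOC boosts, page 12 (the one the human indexer pointed to) comes first, and the rest of the matches follow.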

However, I agree wholeheartedly with the researchers' assertion that (basically) TOCs and BOB indexes (indexes in particular) are generally really good things. When the age of the e-book finally arrives, we'll still need actual humans to continue to create and apply metadata of all types to books (and many other types of information), particularly in the area of creating indexes.

The authors' argument for including TOC and BOB index information as searchable content in library catalogs does seem like a good one. I'd even go so far as to say that TOC and index terms should probably be given a heavier weight in ranking results than terms in the subject/descriptor fields, especially for narrowly defined tasks like finding a small, particularly relevant bit of information in a book, versus wanting to find and read an entire book about a topic. But I think that would only be a small baby step towards the inevitable type of information interface users want (or will want): to be able to run a single, full-text search across all the relevant content within their domain (or related domains). This means we need a single search tool that provides full-text searching across books, journals, web sites, basically anything the librarians choose to include in the, for lack of a better word, catalog. Of course, pulling this overwhelming amount of content together and returning the results in a well sorted, relevancy ranked way for the end user will require that we do more than simple TF-IDF style rankings. We'll need to leverage that incredibly useful additional human-generated metadata, like that contained in indexes (and in the future, through additional mark-up applied directly to the texts, internally, rather than as separate entities like most indexes currently are). I know I would much rather use that mythical catalog for searching, and have it return not only the metadata about the book (title, subjects, etc.) but also show me the keywords in context on the most relevant page that contains my keywords, based on that ever so useful information contained in the TOC, index, and (someday) internal text mark-up. Imagine a day when the index of a book does not simply point, in a serialized linear way, to the pages that have a significant use of the word/term/concept, but where each entry in the index is individually weighted.
In that case, we might find that a search determines that the use of the term the user entered on page 37 of the book is ranked the #1 most important use of the term in the book, even though the term appears 17 times in the book before that point, so it lists page 37 first, rather than the preceding pages that contain the term. Combine this with multiple word searches (or for truly advanced users, nested Boolean searches) and we can suddenly get very fancy with the ranking algorithms. A user searches for [term1 AND term2]. The search/ranking algorithm finds that (among all the other occurrences of the terms) these terms only appear on the same page/close to each other a few times: term 1 appears on page 17 (rank #3), page 33 (rank #2), and page 56 (rank #50); term 2 appears on page 17 (rank #15), page 33 (rank #48), and page 56 (rank #11). We can calculate that page 17, even though it isn't ranked #1 for either term, may very well be the "best" result to offer to the user first, with the same logic applied to sort and organize all the rest of the term matches.
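As a rough illustration of that calculation, here is a small Python sketch that combines the per-term ranks from the example above. Summing the ranks is just one simple choice I am assuming here (a real ranking algorithm could weight terms, proximity, and so on very differently), and the function name `combine_ranks` is made up for this post.

```python
def combine_ranks(term_ranks):
    """Order the pages that contain every term by the sum of their per-term ranks.

    term_ranks: {term: {page_number: rank of that page for the term}}
    A lower rank number means a more important use of the term.
    """
    # An AND query only cares about pages on which all of the terms occur.
    common = set.intersection(*(set(ranks) for ranks in term_ranks.values()))
    # Assumed scoring rule: add up the ranks; the lowest total wins.
    totals = {page: sum(ranks[page] for ranks in term_ranks.values())
              for page in common}
    # Ties are broken by page number to keep the order predictable.
    return sorted(totals, key=lambda page: (totals[page], page))


# The ranks from the [term1 AND term2] example in the text.
term_ranks = {
    "term1": {17: 3, 33: 2, 56: 50},
    "term2": {17: 15, 33: 48, 56: 11},
}

ordered = combine_ranks(term_ranks)
print(ordered)  # [17, 33, 56]
```

Page 17 totals 18 (3 + 15), page 33 totals 50, and page 56 totals 61, so page 17 is offered first even though it is not ranked #1 for either term.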

Wednesday, September 2, 2009

404 Tech Support to the Rescue – netbook touchpad annoyances

Jason overheard me grumbling with a friend about an annoying problem we both had with our netbooks: the touchpad on them, even at the lowest sensitivity setting, picks up accidental brushes of our palms when typing (especially when typing in a hurry). The result is that the cursor jumps to somewhere unexpected, and (worst case scenario) highlights a huge section of the document for overtyping. Ack. He had already tracked down a solution, and sent it my way. I felt like I should share. :)

404 Tech Support: A Week of Google Code, Day 1: touchfreeze
