<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-4231653014599940335</id><updated>2011-08-23T10:24:00.102-07:00</updated><category term='TracSNAP'/><category term='Clipboard'/><category term='Ubuntu'/><category term='UCOSP'/><category term='POSIT'/><title type='text'>Sarah Strong ● POSIT: Portable Open-source Search and Identification Tool</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://sarahestrong.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4231653014599940335/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://sarahestrong.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Sarah Strong</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>12</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-4231653014599940335.post-6361078424367530610</id><published>2011-01-26T21:02:00.000-08:00</published><updated>2011-01-28T12:52:27.491-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='POSIT'/><category scheme='http://www.blogger.com/atom/ns#' term='UCOSP'/><title type='text'>Sprint Wrap Up for POSIT, UCOSP 2011</title><content type='html'>This weekend about fifty 
students from all over Canada got together at the University of Alberta campus for a weekend of coding on our new open source projects. We spent three days getting to know each other and the codebase and had a blast doing it.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;POSIT: Portable Open Source Search and Identification Tool&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;POSIT is an Android app and web project designed to help track finds in the field. Humanitarian aid workers can plan searches by looking at what's already been covered and tag finds with photos and their exact location as they survey the area.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The Team&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Edward Bassett of the University of Alberta worked on POSIT for UCOSP last semester. He was there all weekend to help the new team get a handle on the project.&lt;br /&gt;&lt;br /&gt;Shawn Gryschuk of the University of Saskatchewan has already jumped into user interface improvements.&lt;br /&gt;&lt;br /&gt;Eran Henig of the University of Toronto couldn't make it out to Alberta but was an ever-present collaborator over Skype and IRC.&lt;br /&gt;&lt;br /&gt;Stas Kalashnikov of Simon Fraser University fixed our first big bad bug of the season, stopping images on the server from multiplying like bunnies.&lt;br /&gt;&lt;br /&gt;Ralph Morelli is the POSIT project mentor at Trinity College. 
He made himself available all weekend over voice and video chat to help us get to know the project and get a feeling for where he'd like the project to go this semester.&lt;br /&gt;&lt;br /&gt;Dustin Morrill of the University of Alberta fixed our second big bug, allowing POSIT to fall back gracefully to network location data when GPS is unavailable.&lt;br /&gt;&lt;br /&gt;Jon VanAlten of the University of Alberta hopes to allow finds to be synced over SMS when data networks are unavailable.&lt;br /&gt;&lt;br /&gt;Sarah Strong of the University of Toronto hopes to offer features to support real-time collaboration on searches.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The Sprint: wrestling with Android and hunting bugs&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;We spent most of Friday getting our development environments set up and making sure that we could debug code on the phones and emulators. We also bumped up against permission control limitations of Mercurial hosting on Google Code and had to rethink our collaboration processes. By the end of the day, most of us were up and running and we could get started on exploring the project. We even started to document the bugs we found as we poked around.&lt;br /&gt;&lt;br /&gt;On Saturday, Ralph ran us through a whirlwind tour of the codebase and we started to get set up to work on bug fixes. We wound up settling on each using a clone of the experimental branch as our personal repo, and asking Ralph to pull in changes when we have something worth incorporating into main. Jon started to put together a development POSIT server for us to tinker with.&lt;br /&gt;&lt;br /&gt;Our last few hours on Sunday were spent tying up loose ends and planning for the rest of the semester. We're hoping to get bug fixes and some code cleanup done in the next couple of weeks and make a preliminary release in the next month. The last release was in June and our first release will bring in new fixes and features that are at a stable point. 
We'll work on pitches for features or projects of our own over the next few weeks and we'll each put together a final project. We're bursting with ideas but I'll leave the details for when we have more of an idea of what's feasible. By the end of the semester we hope to release a second version that incorporates the work done by UCOSP students last semester.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4231653014599940335-6361078424367530610?l=sarahestrong.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sarahestrong.blogspot.com/feeds/6361078424367530610/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://sarahestrong.blogspot.com/2011/01/sprint-wrap-up-for-posit-ucosp-2011.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4231653014599940335/posts/default/6361078424367530610'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4231653014599940335/posts/default/6361078424367530610'/><link rel='alternate' type='text/html' href='http://sarahestrong.blogspot.com/2011/01/sprint-wrap-up-for-posit-ucosp-2011.html' title='Sprint Wrap Up for POSIT, UCOSP 2011'/><author><name>Sarah Strong</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4231653014599940335.post-3608227081534913258</id><published>2010-06-08T20:42:00.000-07:00</published><updated>2010-06-17T11:23:45.975-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Ubuntu'/><category scheme='http://www.blogger.com/atom/ns#' term='Clipboard'/><title type='text'>Clipboard managers for Ubuntu</title><content 
type='html'>&lt;div&gt;&lt;b&gt;Patching is hard, let's go shopping! ...for clipboard managers&lt;/b&gt;&lt;br /&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;I had hoped to have at least one patch finished by now, along with general guidelines for implementing the fix in other applications. GTK+'s implementation of textbuffer already respects the&amp;nbsp;&lt;a href="http://www.freedesktop.org/wiki/ClipboardManager"&gt;ClipboardManager&lt;/a&gt;&amp;nbsp;specification, so applications that use it are fixed by default. Any other usage of custom-built text areas in GTK+ seems to vary so wildly from application to application that a specimen fix makes little sense. I'm holding out hope that I can factor out commonalities in the fixes of GTK+ applications to build a library solution, but I haven't had much luck yet.&lt;br /&gt;&lt;br /&gt;I've begun trying to patch vim, openoffice.org, and empathy to conform to the spec, with no success so far. As a change of pace, here's a survey of the clipboard management field as it stands. If we were to decide to fix this problem by bringing a more fully featured clipboard management application up to standard so it could be released as a part of Ubuntu main, it would free us (me!) from having to patch each application individually. A girl can dream, right?&lt;br /&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;&lt;b&gt;Klipper&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;&lt;span class="Apple-style-span" style="font-weight: normal;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/b&gt;&lt;br /&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;Clipboard management isn't a problem in Kubuntu due to the tight integration of Klipper into KDE. It sits in the taskbar intercepting any and all copies performed in the system and making them available both after application quit and as a history in a panel app. Installing Klipper in gnomic Ubuntu brings in a whole lot of KDE libraries. 
On Ubuntu Lucid, I found it threw a warning, "QClipboard::setData: Cannot set X11 selection owner for PRIMARY," and failed to preserve clipboard contents after quit. The copied text&amp;nbsp;&lt;i&gt;was&lt;/i&gt;&amp;nbsp;available in klipper's history, accessible by clicking on the panel icon.&lt;/div&gt;&lt;div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;b&gt;&lt;span class="Apple-style-span" style="font-weight: normal;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;b&gt;GSD-clipboard-manager&lt;/b&gt;&lt;br /&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;Gnome Settings Daemon's clipboard manager runs in the background in Gnome systems including Ubuntu, taking care of preserving clipboard contents after quit by implementing the &lt;a href="http://www.freedesktop.org/wiki/ClipboardManager"&gt;ClipboardManager&lt;/a&gt; specification. This only works for those applications kind enough to implement that same spec themselves.&lt;br /&gt;&lt;br /&gt;One option to consider in this project is to integrate some of the functionality from the panel-based clipboard managers into GSD, keeping a record of each copy as it's performed without keeping a history or adding a panel applet. I've looked through the source and that looks quite doable, but the major consideration is whether it can be done without an&amp;nbsp;inordinate&amp;nbsp;impact on speed or reliability of GSD. Right now, it only registers copies when applications request it. Acting as a full clipboard manager would cause it to record many selections and copies that are never actually needed. 
We could perhaps reduce the load by supporting persistence only on the default (ctrl-V) register in conforming apps, at the expense of consistency.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;b&gt;Clipman&lt;/b&gt;&lt;/div&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;/div&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;Clipman is a component of Xfce and depends on many Xfce libraries. It also has the same behaviour as Klipper when installed in regular old Gnome-based Ubuntu.&amp;nbsp;What makes it interesting is that it includes gsd-clipboard-manager.c, the clipboard management plugin provided with gnome-settings-daemon, within its source. If we were to extend the functionality of gsd-clipboard-manager, this could be a good place to start.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Glipper&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;a href="http://glipper.sourceforge.net/"&gt;Glipper&lt;/a&gt; was written as a Gnome-based alternative to Klipper. It installed oddly on my system, providing no executable in my path and no entry in the Gnome menu. Running the binary installed at /usr/lib/glipper/glipper provided no panel applet but did preserve clipboard contents after quit. It seems it's meant to run as a panel applet, so if I go down the route of working to bring it in as a default part of Ubuntu, I'll have to fix the install process. 
&lt;a href="http://ubuntuforums.org/showpost.php?p=7477756&amp;amp;postcount=12"&gt;Users complain&lt;/a&gt; that it uses too much memory and is buggy. It's a python-based application that was last updated in 2007.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Parcellite&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;a href="http://parcellite.sourceforge.net/"&gt;Parcellite&lt;/a&gt;&amp;nbsp;is another Gnome clipboard panel applet, this time officially abandoned April 2010. It's written in C and I prefer its source to Glipper's mostly based on excellent commenting. It installed cleanly on my system, giving me an entry in "Add to panel" as Clipboard Manager. It preserved the clipboard after quit while sitting in my panel, collecting a history, and I wasn't able to reproduce the one bug I found &lt;a href="http://ubuntuforums.org/showpost.php?p=7804602&amp;amp;postcount=33"&gt;reported&lt;/a&gt; about it, but I did find that its no-panel-applet daemon mode failed to preserve clipboard contents.&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Conclusions&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;/div&gt;&lt;div&gt;Extending an alternate gsd-clipboard-manager that includes persistence seems like it would be worth doing if it could be done without noticeably impacting performance or reliability. GSD, however, is an essential system service that needs to be so speedy and reliable that I'm not sure it's feasible, at least for me.&lt;/div&gt;&lt;div&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;/div&gt;&lt;div&gt;It would be a good idea, I think, to make the Clipboard Manager panel applet (parcellite) available on a default install, without actually adding it to the default panel. That way we give users the option of fixing this bug in an easily discoverable sort of way without potentially running down slow systems or adding to panel clutter. 
I'll bring it up with my GSOC sponsor, &lt;a href="http://gould.cx/ted/blog"&gt;Ted&lt;/a&gt;, and ask whether it would be appropriate to suggest it and spend time during my employment testing, fixing, and readying it for inclusion. That seems unlikely to happen at this late date, so I'll probably take up the maintenance and the cause once my time as a GSOC student wraps up.&lt;/div&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4231653014599940335-3608227081534913258?l=sarahestrong.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sarahestrong.blogspot.com/feeds/3608227081534913258/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://sarahestrong.blogspot.com/2010/06/clipboard-managers-for-ubuntu.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4231653014599940335/posts/default/3608227081534913258'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4231653014599940335/posts/default/3608227081534913258'/><link rel='alternate' type='text/html' href='http://sarahestrong.blogspot.com/2010/06/clipboard-managers-for-ubuntu.html' title='Clipboard managers for Ubuntu'/><author><name>Sarah Strong</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4231653014599940335.post-7667827367383374360</id><published>2010-05-21T02:17:00.000-07:00</published><updated>2010-06-17T11:23:45.976-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Ubuntu'/><category 
scheme='http://www.blogger.com/atom/ns#' term='Clipboard'/><title type='text'>Gearing up for GSoC: Clipboard Persistence for Ubuntu</title><content type='html'>&lt;b&gt;GSoC &amp;amp; Ubuntu Clipboard Improvements&lt;/b&gt; &lt;br /&gt;&lt;br /&gt;This summer I'll be tackling a Google Summer of Code assignment with Ubuntu to keep clipboard contents from being lost when an application quits. You can check out &lt;a href="https://wiki.ubuntu.com/GSoC/2010/SarahStrong"&gt;my application here&lt;/a&gt;, developed with lots of input from my mentor Ted Gould and Ubuntu developers James Westby and David Bensimmon on IRC and &lt;a href="https://lists.ubuntu.com/archives/ubuntu-soc/2010-April/000211.html"&gt;the ubuntu-soc mailing list&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The problem: data loss on quit&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Say you're writing an email in your word processor. If you copy it, paste it into your email client, and then close the word processor, you're golden. If you copy it, close the word processor, and try to paste, you're probably out of luck. It's an odd case, but when it happens it means loss of user data.&lt;br /&gt;&lt;br /&gt;The problem happens because Xorg takes a conservative approach to copying. It copies only a reference to the original data when the user performs a select or copy. It doesn't go and retrieve the actual data from the source program until the user requests a paste. This saves a lot of unneeded data transfer, at the expense of having no way of retrieving data from a closed program that hasn't saved its clipboard somewhere else. &lt;br /&gt;&lt;br /&gt;&lt;b&gt;The solution: save on exit&lt;/b&gt;&lt;b&gt; &lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://freedesktop.org/wiki/ClipboardManager"&gt;Freedesktop's ClipboardManager specification&lt;/a&gt; comes to the rescue. 
Gnome settings daemon, the component of Ubuntu that handles all copying and pasting by default, conforms by allowing applications to explicitly request to save their clipboard contents in a safe place. Applications conform by requesting a save &lt;i&gt;before&lt;/i&gt; they exit. Everything gets squared away before a quit and we don't lose any data. Unfortunately, there are very few applications that conform to this standard, and we believe that few developers are even aware of the problem. I hope to put together an online guide to fixing the problem while patching several popular Ubuntu programs myself.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;A batch fix?&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The ad hoc approach of fixing a series of apps seems like it'll work, but we're looking for a more systematic way to knock out the problem in many places at once. Right now, we have a variety of clipboard history applications that provide clipboard persistence by keeping track of each copy performed. These panel apps get the job done, but they're probably too heavyweight for default inclusion in Ubuntu. A lighter-weight solution might be to create a library that GTK+ application developers can use to easily fix this problem. I'll be comparing existing fixes to figure out whether this is a feasible approach.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Things to watch out for: performance problems, format support, upcoming GTK+ improvements&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;There are several things to keep in mind as I start investigating the problem.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;I've seen &lt;a href="http://src.chromium.org/svn/trunk/src/app/clipboard/clipboard_linux.cc"&gt;reports&lt;/a&gt; that saving clipboard data can incur a performance hit. 
I'll need to make sure I'm not imposing an unacceptable performance hit before I push out changes to a bunch of programs.&lt;/li&gt;&lt;li&gt;An application can broadcast that it can provide a picture the user has copied as a jpg to an image program and as a text link to an editor. I'll need to make sure I'm not introducing any regressions in multiple-format support with any changes I make, and report on support status as I go.&lt;/li&gt;&lt;li&gt;There are some changes that might be coming down the pipeline to GTK+ this summer with the addition of a base application class. Ted mentioned that the sort of changes I'm proposing might be made easier then. All the more reason, then, to keep what I'm doing well documented so it can be easily reimplemented when there's a better place to put it.&lt;/li&gt;&lt;/ul&gt;&lt;b&gt;Timeline (so far...)&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Weeks one and two, &lt;/b&gt;May 24 - June 6, 2010&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Create an example program that exhibits the problem&lt;/li&gt;&lt;li&gt;Fix the problem in the example program&lt;/li&gt;&lt;li&gt;Put up a website that describes the problem and shows how to fix it using the example program&lt;/li&gt;&lt;li&gt;Add a page of extra links and explanatory material for anyone who's researching the problem; have a section at the top for users who might just be googling the behaviour.&lt;/li&gt;&lt;/ol&gt;&lt;b&gt;Weeks three and four, &lt;/b&gt;June 7 - 21, 2010&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Compile a list of existing patches that fix this problem and add them to the link page&lt;/li&gt;&lt;li&gt;Read them to get an idea of how it's being fixed and where there are commonalities that could be factored out&lt;/li&gt;&lt;li&gt;Fix the problem in one real application&lt;/li&gt;&lt;li&gt;If it seems like a common library to make future fixes easier makes sense, put together a proposal for it&lt;/li&gt;&lt;li&gt;Update documentation page so that it would have 
been useful to me in fixing the real app I tackled&lt;/li&gt;&lt;/ol&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4231653014599940335-7667827367383374360?l=sarahestrong.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sarahestrong.blogspot.com/feeds/7667827367383374360/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://sarahestrong.blogspot.com/2010/05/gearing-up-for-gsoc-clipboard.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4231653014599940335/posts/default/7667827367383374360'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4231653014599940335/posts/default/7667827367383374360'/><link rel='alternate' type='text/html' href='http://sarahestrong.blogspot.com/2010/05/gearing-up-for-gsoc-clipboard.html' title='Gearing up for GSoC: Clipboard Persistence for Ubuntu'/><author><name>Sarah Strong</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4231653014599940335.post-7519930779085777528</id><published>2009-07-31T12:04:00.000-07:00</published><updated>2010-05-18T10:13:53.026-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='TracSNAP'/><title type='text'>Wrapping up for the summer</title><content type='html'>I just spoke to Steve about plans for wrapping up our projects for the summer, and he asked me to write up a summary of the steps we'll need to take before the fall.&lt;br /&gt;&lt;ol&gt;&lt;br /&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Prepare for the &lt;a 
href="http://web.cs.toronto.edu/news/events/AUG_18___Undergraduate_Summer_Research_Poster_Session.htm"&gt;poster session&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Make screencasts&lt;/span&gt;&lt;br /&gt;We're planning on releasing demos as screencasts for each project. Maybe we should make a quick and dirty one right away to get feedback and throw light on those features that need work before the end of the summer, and then make a nicer one in a few weeks?&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Move the source to a public site such as sourceforge&lt;/span&gt;&lt;br /&gt;If we do this right away, we can ticket code cleanup tasks to give us a nice roadmap for the end of the summer, and for any future development on the projects.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Provide documentation&lt;/span&gt;&lt;br /&gt;Make sure code is commented nicely and provide an overview for future developers.&lt;br /&gt;Also include help/about/how-to information in the app, where appropriate, for end users.&lt;br /&gt;We can also do light refactoring and clean up code as we comment.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Fix bugs and remove hacks&lt;/span&gt;&lt;br /&gt;It's easy to leave in buggy features in a single developer project if you know the workaround. We'll need to fix that up. 
A better testing suite would be nice to have, but might be of lower priority than fixing the bugs we know we have.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Add one or two killer features&lt;/span&gt;&lt;br /&gt;We'll need to brutally triage so we don't ignore the boring but necessary cleanup and documentation tasks, but we got great feedback this week from people who watched our presentations and we've all got a couple of features we'll want to be able to leave the project with&lt;/li&gt;&lt;br /&gt;&lt;/ol&gt;&lt;br /&gt;By way of example, here's how I think we could apply this to our project, TracSNAP.&lt;br /&gt;&lt;ol&gt;&lt;br /&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Prepare for the &lt;a href="http://web.cs.toronto.edu/news/events/AUG_18___Undergraduate_Summer_Research_Poster_Session.htm"&gt;poster session&lt;/a&gt;&lt;/span&gt;&lt;br /&gt;The screenshots and explanation on the poster could be reused as documentation and info for the front page of our TracHacks page.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Make screencasts&lt;/span&gt;&lt;br /&gt;I already have an idea of what I want fixed for our screencast from the demo session we just did. A quick one soon would be good, though.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Move the source to a public site such as sourceforge&lt;/span&gt;&lt;br /&gt;TracSNAP belongs on &lt;a href="http://trac-hacks.org/"&gt;TracHacks&lt;/a&gt;. I'll throw our source up there and start ticketing changes as soon as I get the go-ahead from Ainsley. &lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Provide documentation&lt;/span&gt;&lt;br /&gt;I can go through the code and check for any egregious omissions in commenting this weekend. On Tuesday, I'll add barebones user-accessible help on each feature. We'll put together an overview of features for the poster session and our TracHacks plugin page next week. 
Beyond that, I think better documentation should wait until we've fixed a few bugs and decided on what features we'd like to add by the end of the summer.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Fix bugs and remove hacks&lt;/span&gt;&lt;br /&gt;We have several known bugs and ugly workarounds in our project, and moving to a real project management system and ticketing them is the first step. Then we'll prioritize and work on fixing them.&lt;br /&gt;Most pressing: Update repository data on every commit. Remove extra tabs only used for development. Grab real emails and work on mapping Trac logins to Subversion logins in a sane, if not perfect, way. Decide on saner algorithms for determining relatedness and expertise.&lt;br /&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Add one or two killer features&lt;/span&gt;&lt;br /&gt;Since &lt;a href="http://code.google.com/p/jsviz/"&gt;JSViz&lt;/a&gt; seems to be pretty broken for parent nodes with more than 18 children, moving to &lt;a href="http://flare.prefuse.org/"&gt;Flare&lt;/a&gt; is probably a key feature. This is &lt;a href="http://individual.utoronto.ca/ainsley/summer09.html"&gt;Ainsley&lt;/a&gt;'s department and I'll defer to her judgement on whether it'll be doable in less than a month. 
&lt;a href="http://skoolr.blogspot.com/"&gt;Jon Pipitone&lt;/a&gt; suggested we get together with &lt;a href="http://climatetooldev.blogspot.com/"&gt;Brent&lt;/a&gt; and a few grad students and work together on a code sprint to get Flare up and running on both TracSNAP and Breadcrumbs, if possible.&lt;br /&gt;Other possible features:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Improved UI - request suggestions, and make everything scale better with screen size.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Import social network data from existing products that generate it from email logs and the like.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Adapt Anita Sarma's algorithms from the Tesseract project for determining relatedness to this project.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Do you have a suggestion? Leave a comment - thanks!&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;/li&gt;&lt;/ol&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4231653014599940335-7519930779085777528?l=sarahestrong.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sarahestrong.blogspot.com/feeds/7519930779085777528/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://sarahestrong.blogspot.com/2009/07/wrapping-up-for-summer.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4231653014599940335/posts/default/7519930779085777528'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4231653014599940335/posts/default/7519930779085777528'/><link rel='alternate' type='text/html' href='http://sarahestrong.blogspot.com/2009/07/wrapping-up-for-summer.html' title='Wrapping up for the summer'/><author><name>Sarah Strong</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4231653014599940335.post-6340612811534349898</id><published>2009-05-25T12:34:00.000-07:00</published><updated>2010-05-18T10:13:53.034-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='TracSNAP'/><title type='text'>Pitching to the Tesseract folks</title><content type='html'>We had a group meeting today to present our feasibility findings and project plans, and it went quite well. It seems that Steve thinks that Anita Sarma and the other developers for &lt;a href="http://www.cs.cmu.edu/%7Eantz/tesseract.html"&gt;Tesseract&lt;/a&gt; would be amenable to letting us use vast swaths of their code as a backend for our social network project, which would simplify our project immensely.&lt;br /&gt;&lt;br /&gt;Tesseract does all the data harvesting and analysis we want to do, but presents the data in a complex, freestanding web app. We'd work on tweaking the analysis portion to work well with the data the Hadley Centre keeps (if necessary), getting it to run as an unattended part of the project management/repository back end, and pushing the congruence data it generates to extremely simple views within the project's Trac site.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4231653014599940335-6340612811534349898?l=sarahestrong.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sarahestrong.blogspot.com/feeds/6340612811534349898/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://sarahestrong.blogspot.com/2009/05/pitching-to-tesseract-folks.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4231653014599940335/posts/default/6340612811534349898'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/4231653014599940335/posts/default/6340612811534349898'/><link rel='alternate' type='text/html' href='http://sarahestrong.blogspot.com/2009/05/pitching-to-tesseract-folks.html' title='Pitching to the Tesseract folks'/><author><name>Sarah Strong</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4231653014599940335.post-5145824119522209557</id><published>2009-05-21T11:26:00.000-07:00</published><updated>2010-05-18T10:13:53.035-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='TracSNAP'/><title type='text'>Mostly minuatae</title><content type='html'>&lt;span style="font-weight: bold;"&gt;Social networking in Trac thoughts&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;I set up toy local Trac and Subversion servers to look at what information's available out of the box. It turns out that Trac doesn't really track anything that could be useful for building a graph of straight-up social interactions. This suggests some things about how to set up the project: our repository authorship graph maker is a totally separate module from the social network graph maker, both export to a common network representation format, the recommendation engine combines them and spits out information, and the Trac plugin serves pretty views on that info. This is probably the best way to set it up regardless of the social network information source (especially if we want to be able to adapt it to different VCSs and viewers), but it's good to start thinking about more concrete choices.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;It's my understanding that at the Hadley Centre, they would likely be able to feed all work email history into the social graph maker, and that guided my earlier description of how to create a social graph. 
I'd really like to make a suite of tools that could potentially be useful to other projects, though, so it's worth thinking about what resources others might have available. Many open source projects use mailing lists to communicate, and it makes sense to base a social graph of mailing list participants on who has replied to whom. More on this as I consider it.&lt;/li&gt;&lt;li&gt;How should we track LOC edited? I don't know whether Hadley uses BDB or FSFS for their subversion backend. FSFS introspection looks pretty straightforward: each revision has an author, each revision file has a list of deltas, followed by a list of information about the files revised. It'd probably be better to use existing parsers, even if all we want is linecount/filename.&lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4231653014599940335-5145824119522209557?l=sarahestrong.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sarahestrong.blogspot.com/feeds/5145824119522209557/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://sarahestrong.blogspot.com/2009/05/mostly-minuatae.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4231653014599940335/posts/default/5145824119522209557'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4231653014599940335/posts/default/5145824119522209557'/><link rel='alternate' type='text/html' href='http://sarahestrong.blogspot.com/2009/05/mostly-minuatae.html' title='Mostly minuatae'/><author><name>Sarah Strong</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4231653014599940335.post-5056789009309195672</id><published>2009-05-19T11:32:00.000-07:00</published><updated>2010-05-18T10:13:53.035-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='TracSNAP'/><title type='text'>Social networking on Trac: pinning down the specifics</title><content type='html'>I'm working with Ainsley Lawson now, and I'll defer to her &lt;a href="http://individual.utoronto.ca/ainsley/2009/05/hi.html"&gt;excellent post&lt;/a&gt; for a summary of the purpose and evaluation issues she's been researching. Speaking with her about the project gave me a much clearer idea of the specifics of the network we'll be building, so here goes the clearest summary I can make so far:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;To build the graphs:&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;b&gt;Create a social relationship graph.&lt;/b&gt; &lt;br /&gt;&lt;br /&gt;Look at email to: and from: fields in the tracked communications and give each pair of people a relationship point for each time one emails the other. Use the relationship points to determine strength of connection in a relationship graph.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Create a code relatedness graph.&lt;br /&gt;&lt;br /&gt;&lt;/b&gt;For each pair of code modules, give them a relatedness point for each time they've been checked in at the same time. 
This code relatedness thing could get much more complex, but I understand there's a lot of source visualization software out there that's already solved these problems, so we could look at those tools.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Create a module-by-module expertise listing.&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;For each code module, look at the Subversion history and record the number of lines of code each distinct author has added, changed, and deleted over the life of the module (LOC edited).&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Create a shared authorship graph.&lt;/b&gt; This one's still very rough.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;For each pair of people, for each code module, give them min(A's LOC edited, B's LOC edited) shared authorship points.&lt;/li&gt;&lt;li&gt;For each pair of people, for each pair of related code modules, give them (min(A's LOC edited in both, B's LOC edited in both)*relatedness/something) shared authorship points.&lt;/li&gt;&lt;li&gt;Rationale: two heavy editors should get a higher rating than one heavy editor and one light editor, hence the min() construction.&lt;/li&gt;&lt;li&gt;Edits in related modules should count for less than edits in the same module, hence the "/something" denominator, probably to be determined by dumb tweaking until it lines up with the results of surveying the coders about their network or something.&lt;/li&gt;&lt;li&gt;The total shared authorship points between each pair of people is the strength of their connection in the graph.&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-weight: bold;"&gt;So what do we do with them?&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;The primary purpose would be to decide on a threshold difference between relationship points and shared authorship points at which we'd consider a pair of people not to be communicating effectively. 
If Alice and Bob have 2000 shared authorship points but only 500 relationship points, we would add them to each other's recommended collaborators feed, available as a widget down the side of the Trac project home page with a link to one another's emails or something.&lt;br /&gt;&lt;br /&gt;Other possibilities:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;People can input the name of a module and get back a list of the experts on that module (determined by LOC edited), and maybe a list of related module expertise search links.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;To really reach, the above could be smarter, perhaps. If I'm writing an in-Trac email or bug report that mentions modules by name, it could automatically suggest additional people to copy the ticket to.&lt;/li&gt;&lt;li&gt;You could have a list of experts in modules you've recently checked in as a quick-contact box (with manual add and stickying people allowed).&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Managers can see a visualization of discrepancies between the social and shared authorship graphs to help diagnose organizational inefficiencies.&lt;/li&gt;&lt;li&gt;When Bob shows up on Alice's collaborators feed, she can click "Who's Bob?" and see a graph of the social network with paths between her and Bob highlighted.&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-weight: bold;"&gt;Things to consider&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;Should expertise slowly expire? It could make sense for experience within the last year to count more than experience from several years back. This would mean counting expertise points as LOC edited as a function of time - not hard to do since we'll be getting our info from diffs anyway, but it stinks of unnecessary complexity.&lt;/li&gt;&lt;li&gt;Should we allow for diff-by-diff updates of the graphs, or assume they'll just be fully rebuilt once a week or whatever? 
Probably the latter to start off, until we have an idea of just how big the organization is.&lt;/li&gt;&lt;li&gt;We must keep in mind that we're doing all this fancy footwork in order to deliver a final product that's extremely simple so people might actually use it. Other social network graphing solutions exist; we need to focus on making ours simple and directed. The recommended collaborators feature fits, but not all of the others do.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Note: Thanks to Ainsley for the terminology correction, and please see &lt;a href="http://individual.utoronto.ca/ainsley/summer09.html"&gt;her similar post&lt;/a&gt; for more information on these ideas.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4231653014599940335-5056789009309195672?l=sarahestrong.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sarahestrong.blogspot.com/feeds/5056789009309195672/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://sarahestrong.blogspot.com/2009/05/social-networking-on-trac-pinning-down.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4231653014599940335/posts/default/5056789009309195672'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4231653014599940335/posts/default/5056789009309195672'/><link rel='alternate' type='text/html' href='http://sarahestrong.blogspot.com/2009/05/social-networking-on-trac-pinning-down.html' title='Social networking on Trac: pinning down the specifics'/><author><name>Sarah Strong</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4231653014599940335.post-1060661852418761255</id><published>2009-05-15T12:43:00.000-07:00</published><updated>2010-05-18T10:13:53.035-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='TracSNAP'/><title type='text'>Social Network project as Trac plugin</title><content type='html'>I've been thinking a lot about how I'd go about making the social network visualization idea into a useful piece of software. I'm thinking it would be a plugin for Trac that would have a run-once analysis of the database (using one of the many free tools for generating social network data from emails) that spits out a social network graph accessible as a tab or subpage of the Trac web interface. It would then query the database daily for new information and add that in, though the complexity of adding new data to an existing graph might outweigh the benefits of not duplicating analysis if the existing software doesn't support that.&lt;br /&gt;&lt;br /&gt;Now, a pretty graph is not too useful, so what else could you do? Well, there's the original plan we had of generating a code authorship network as well, overlaying them, and identifying some discrepancies as inefficiencies in the project that can be fixed by introducing people to one another or even restructuring teams. 
That sounds hard.&lt;br /&gt;&lt;br /&gt;Even harder would be some sort of semantic analysis - I have a word cloud culled from emails and tickets relating to each person, and when I submit a ticket or send an email, it suggests more people to add to the recipient field based on keywords I've just typed.&lt;br /&gt;&lt;br /&gt;Hmm, so I guess where I'm at is that I can see how to set up the basic system, but I'm not sure whether I can get more than one dataset so we can have comparisons and recommendations, rather than just straight up visualization of what's already happening. So, off to research! I'll check out the free tools on Wikipedia's page of &lt;a href="http://en.wikipedia.org/wiki/Social_network_analysis_software"&gt;social network analysis software&lt;/a&gt;, and start a search for what's been done in the way of repository introspection.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4231653014599940335-1060661852418761255?l=sarahestrong.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sarahestrong.blogspot.com/feeds/1060661852418761255/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://sarahestrong.blogspot.com/2009/05/social-network-project-as-trac-plugin.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4231653014599940335/posts/default/1060661852418761255'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4231653014599940335/posts/default/1060661852418761255'/><link rel='alternate' type='text/html' href='http://sarahestrong.blogspot.com/2009/05/social-network-project-as-trac-plugin.html' title='Social Network project as Trac plugin'/><author><name>Sarah Strong</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4231653014599940335.post-1907193584900758722</id><published>2009-05-15T06:43:00.000-07:00</published><updated>2010-05-18T10:13:53.036-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='TracSNAP'/><title type='text'>Social network analysis</title><content type='html'>We had a meeting with Steve to clear up the scope of the projects we're investigating and to assign research.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Social Network Analysis from Project Management Data&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;This is what I'll be investigating for the next week. So far, I haven't found any closely similar projects, but the field itself is daunting. "Who should fix this bug," discussed in a previous post, described an ambitious tool for analyzing bug tracking information with a smaller results scope and a better sense of what would constitute success, and they got only so-so results out of their project. &lt;a href="http://en.wikipedia.org/wiki/Social_network"&gt;Wikipedia's coverage&lt;/a&gt; of building social network graphs is making my head explode. It's a lot to take in, so I'll try to list issues to investigate here:&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;Where can I get input data? Ideally, I'd grab the full backend database for an instantiation of a Trac variant supporting a real, somewhat long-lived and complex project. I'm told I should ask Greg Wilson and David Wolover about getting DrProject history.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Once I have data and have processed it, what do I plan on doing with it? How would I test my results? The bug assignment team could compare predicted bugfix assignees to who actually closed the ticket; what's my metric? 
Would a comparison to some sort of aggregated contact graph, like the one from &lt;a href="http://code.google.com/apis/socialgraph/"&gt;Google's social graphing results&lt;/a&gt;, be fruitful? It's unlikely, since we're ranking social contact within a work environment, while most social networking data online is voluntary. Maybe some sort of survey set up for participants to rate their working relationships with one another? This seems like the best route, but I'd have to set it up ahead of time so as not to fall into the post-hoc analysis trap.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;What about the graph itself? Should connections be directionally weighted (I think that's the term)? That is, if everyone contacts the intern to assign her small tasks but she usually only contacts her direct supervisor, should we keep track of the distinction or just collapse it into "has contact with many people"? Should we count mentions of each other's names in communication? Changes in assigned-to status from A to B as a link between them? Actual emails? Should some links count more than others? By how much? What sort of crazy voodoo could possibly guide my choice there? I think one thing to do would be to construct different graphs for different contact types, with the ability to overlay/combine them later. Another possibility is to take a page from how these scientists run their models and gather survey results first, then experiment with the weightings in our program until its output closely matches the survey results.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;What sort of out-of-the-box solutions are available to me for visualizing social networks? What about for graphs like this in general?&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Should I be planning on making something that's specifically suited to their team? Or a more general tool?&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Apparently people are trying for &lt;a href="http://gmpg.org/xfn/"&gt;an open standard&lt;/a&gt; on disambiguating social links. 
They're  kind of cool, and could be useful for a variety of our projects here.&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4231653014599940335-1907193584900758722?l=sarahestrong.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sarahestrong.blogspot.com/feeds/1907193584900758722/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://sarahestrong.blogspot.com/2009/05/social-network-analysis.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4231653014599940335/posts/default/1907193584900758722'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4231653014599940335/posts/default/1907193584900758722'/><link rel='alternate' type='text/html' href='http://sarahestrong.blogspot.com/2009/05/social-network-analysis.html' title='Social network analysis'/><author><name>Sarah Strong</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4231653014599940335.post-6296040466552062145</id><published>2009-05-12T12:53:00.000-07:00</published><updated>2010-05-18T10:13:53.036-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='TracSNAP'/><title type='text'></title><content type='html'>&lt;span style="font-weight: bold;"&gt;Reflections on the research paper alerts project&lt;/span&gt;&lt;br /&gt;Because an attempt to create or support a large-scale crawler would be madness, I figure we'd use an existing search service to find new research papers based on users' queries. 
I'm not sure, however, what would be accessible to us.&lt;br /&gt;&lt;br /&gt;We might qualify for access to &lt;a href="http://research.google.com/university/search/"&gt;Google research&lt;/a&gt;, but it would tightly limit what we could do with the project at the end, possibly making our results useless unless they're adopted by a research paper search company. The Google search API is probably largely useless, as results are limited to 64 entries and, moreover, the terms require that the search component not comprise the core of your app or webpage.&lt;br /&gt;&lt;br /&gt;Scraping search results from a free or pay service is almost certainly out of the question. I'm pretty excited about this project as a practical one that's within my abilities once the search source is figured out, though. There are a few services out there that seem to be using Google Scholar results, so maybe it's easier than it looks; see &lt;a href="http://www.harzing.com/pop.htm"&gt;Publish or Perish&lt;/a&gt; - I don't know how these guys are licensed - and &lt;a href="http://pubfeed.cs.toronto.edu/"&gt;Pubfeed&lt;/a&gt; - Maria reports insufficient results on this one, but it's a local project, so I'll ask around. &lt;a href="http://hublog.hubmed.org/archives/001002.html"&gt;This 'touchgraph'&lt;/a&gt; does it in an interesting way: it's a bookmarklet, so they don't need to return Google search results elsewhere. Not quite applicable, but it's getting me thinking about alternate ways of doing this.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;More readings&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.cs.toronto.edu/%7Esme/papers/2008/CiSE-FCMpaper.pdf"&gt;Configuration Management for Large-Scale Scientific Computing at the UK Met Office&lt;/a&gt;&lt;br /&gt;A description of developing and deploying a new configuration management system for the research group. I now have a slightly better handle on their current processes and the information that'll be available to us. 
For instance, much of the old version history was imported when they moved to subversion a few years ago. The key takeaway for me was how much support and customization was required to get them to adopt a new system. Any tools we build will have to be extremely easy to use with obvious and immediate benefit if they're to be useful. Simplicity will be the byword.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.cs.toronto.edu/%7Egvwilson/articles/amsci-swc-2005.pdf"&gt;Where’s the Real Bottleneck in Scientific Computing?&lt;/a&gt; and &lt;a href="http://www.cs.toronto.edu/%7Egvwilson/articles/cise-swc-2006.pdf"&gt;Software Carpentry&lt;/a&gt;&lt;br /&gt;Quick reads on the basics computational scientists should be taught. Basically covers the material in CSC108 and CSC148 from a slightly different perspective.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://portal.acm.org/citation.cfm?id=1248886"&gt;Software Development Environments for Scientific and Engineering Software: A Series of Case Studies&lt;/a&gt;&lt;br /&gt;Gives some insight as to how researchers come to conclusions about software engineering, but not really worth the read. Skip to section 5 for conclusions about how large scientific computing teams work.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.cs.ubc.ca/labs/spl/projects/bugTriage/papers/icse2006.pdf"&gt;Who should fix this bug?&lt;/a&gt;&lt;br /&gt;An extremely interesting look at a project to cull information from bug reports and CVS repositories for Eclipse and Mozilla for automatic recommendations as to who should be assigned new bugs. It looks to me like what they worked on was way out of scope for the time and expertise our team has available, but it's from a few years back, so there may be further projects and tools available now that we could model our attempts at developing social network models from repository information on. 
Even if we don't use anything like this, it's an illuminating look at the complexities involved in developing and testing an aggregator from this sort of data.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://jonudell.net/GroupwareReport.html"&gt;Internet Groupware for Scientific Collaboration&lt;/a&gt;&lt;br /&gt;An overview of group collaboration software as of 2000. I found this really useful as an introduction to the culture of the discourse; some of the comments made by Steve and Greg make more sense in the context of the goals and challenges of group collaboration online here. The much more recent post &lt;a href="http://blog.openwetware.org/scienceintheopen/2009/04/26/now-thats-what-i-call-social-networking/"&gt;Now that’s what I call social networking…&lt;/a&gt; kind of helped tie it in to current technology trends for me.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.djangobook.com/"&gt;The Django Book&lt;/a&gt;&lt;br /&gt;I'm coming around. It feels like slower going than learning Rails because they focus heavily on making explicit things that just kind of happened in Rails. 
I really do appreciate that level of control, however, and I think I'm going to enjoy working in it.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4231653014599940335-6296040466552062145?l=sarahestrong.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sarahestrong.blogspot.com/feeds/6296040466552062145/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://sarahestrong.blogspot.com/2009/05/more-readings-configuration-management.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4231653014599940335/posts/default/6296040466552062145'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4231653014599940335/posts/default/6296040466552062145'/><link rel='alternate' type='text/html' href='http://sarahestrong.blogspot.com/2009/05/more-readings-configuration-management.html' title=''/><author><name>Sarah Strong</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4231653014599940335.post-1522695519177184090</id><published>2009-05-12T11:22:00.000-07:00</published><updated>2010-05-18T10:13:53.036-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='TracSNAP'/><title type='text'>Second day's impressions</title><content type='html'>So right now we're all tasked with getting comfortable with Django on the assumption that we'll be doing a fair amount of web programming. 
I worked in Ruby on Rails last summer, so I'm theoretically point man on this, but we haven't really gotten far enough for my familiarity with web frameworks in general to come into play.&lt;br /&gt;&lt;br /&gt;I'm taking the modest head start I've got as license to spend some extra time reading up on the problem domain and thinking about the possible projects suggested by Steve Easterbrook and Greg Wilson.&lt;br /&gt;&lt;br /&gt;&lt;p&gt;&lt;b&gt;Research alerts with a social component&lt;/b&gt; &lt;/p&gt;&lt;p&gt;People set up queries and receive alerts when relevant papers are released. We'd analyze the queries and/or results to suggest contacts with people researching similar topics. &lt;/p&gt; &lt;ul&gt;&lt;li&gt; It would be nice to keep things loosely coupled so we can have a central place for queries with the ability for people to add new frontends for different places to use it. For example: I have a widget on my blog that informs me and others of what's been recently recommended. It suggests I talk to B, who uses the same service through a Facebook app; C, who uses it through a dedicated website; and D, who uses a desktop app that automatically harvests papers searched for (like last.fm? - this one would be hard and way off spec, but fun for a future project idea). &lt;/li&gt;&lt;li&gt; We'd want to have 'roles' available - I might not want to restrict my results to people who have the same two subspecialties as I do. &lt;/li&gt;&lt;li&gt; Are we looking at piggybacking on existing search engines? That would make sense, but we'd need to ensure we're respecting fair use. &lt;/li&gt;&lt;/ul&gt; &lt;p&gt;&lt;b&gt;Electronic lab book&lt;/b&gt; &lt;/p&gt;&lt;p&gt;These researchers are using basic wikis to keep research notes. How can we make this more useful? Can we mostly replicate the function of paper lab books so that research processes can be more easily shared? &lt;/p&gt; &lt;ul&gt;&lt;li&gt;I need to see some paper lab books or grill a scientist. 
Really, I have no idea what would be useful.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;The most basic thing would be a suite of wiki templates. They're probably already using something like this.&lt;/li&gt;&lt;li&gt;Is it realistic to consider whether they might move to tablet computers + handwriting recognition office software soon, letting them simply use the screen as they've been using paper? I've read that the technology's supposed to get much cheaper soon, and that Windows 7 is slated for inbuilt support, so maybe that'll just happen for them as it gets more broadly adopted. "Someone else will probably fix it in the future" isn't a very good plan, though.&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-weight: bold;"&gt;More non-specific social networking stuff&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Construct a graph of social interactions by mining old emails, forums, agendas, and team lists. Make another of code dependencies related to authorship information. Compare the graphs, with an eye to determining whether and which discrepancies are evidence of communication inefficiencies.&lt;br /&gt;&lt;br /&gt;This would be an interesting project, but making it reasonably transferable to analysis of information from other organizations sounds like a beast of a job. Once you've got social network graphs from other sources such as the research alerts project, however, they shouldn't be too tough to combine; the trick would be to figure out how significant the differences are and whether you're generating useful comparisons. The data can then be used for a variety of tools. As you can probably tell, I'm a little hazy on this whole process, but I'm reading up. 
For a much more cogent explanation, see &lt;a href="http://www.easterbrook.ca/steve/?p=430"&gt;this post&lt;/a&gt; by Steve.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Ways to easily add visualizations of data to papers and websites&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;It looks like there's already a lot of quality work going on in standards for embedding the code that generates the visualization into the research paper itself. I'm not quite sure where we could help, but the idea's been floating around, so I'm leaving it here as a reminder to ask around.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;What I'm reading&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.cs.toronto.edu/%7Esme/papers/2008/Easterbrook-Johns-2008.pdf"&gt;Engineering the Software for Understanding Climate Change&lt;/a&gt;&lt;br /&gt;An overview of the working environment of the researchers we'll be trying to help. Focuses on the differences between their processes and ones we're more used to in software development, and on challenges to productivity that could be solved by software engineering tools and practices.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.djangobook.com/"&gt;The Django Book&lt;/a&gt;&lt;br /&gt;Not excited yet. 
Must press on.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4231653014599940335-1522695519177184090?l=sarahestrong.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sarahestrong.blogspot.com/feeds/1522695519177184090/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://sarahestrong.blogspot.com/2009/05/second-days-impressions.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4231653014599940335/posts/default/1522695519177184090'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4231653014599940335/posts/default/1522695519177184090'/><link rel='alternate' type='text/html' href='http://sarahestrong.blogspot.com/2009/05/second-days-impressions.html' title='Second day&apos;s impressions'/><author><name>Sarah Strong</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4231653014599940335.post-2772501482255968028</id><published>2009-05-12T10:51:00.000-07:00</published><updated>2010-05-18T10:13:53.036-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='TracSNAP'/><title type='text'>Beginnings</title><content type='html'>&lt;span style="font-weight: bold;"&gt;Who?&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;I'm an undergraduate student at the University of Toronto working on software development support tools for climate scientists, funded by a Natural Sciences and Engineering Research Council Undergraduate Student Research Award 
(NSERC-USRA). I'm working with four other students under the supervision of &lt;a href="http://www.easterbrook.ca/steve/"&gt;Steve Easterbrook&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;What?&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;So far, the only constraint is that we work on developing tools that might be useful to the researchers at the &lt;a href="http://www.metoffice.gov.uk/climatechange/"&gt;Met Office Hadley Centre&lt;/a&gt; and similar departments around the world. I'm just starting to learn about how they work. These researchers develop complex software models of climate systems and run them as experiments, comparing results with other projections and real-world observations. They work in Fortran on code that has components still in use decades after their original conception, and have recently started collaborating more heavily with other research groups abroad.&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;span style="font-weight: bold;"&gt;Why?&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;I'm starting this blog to help me organize my thoughts for my summer research position. I figure I'll discuss what I'm working on now and where I think we should be going in the future, plus any difficulties I'm having. I'll toss up what I've been reading with links and summaries to jog my memory, too. 
I hope it'll be useful to my teammates to be able to see what I'm working on - and to correct me when I'm off base.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4231653014599940335-2772501482255968028?l=sarahestrong.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sarahestrong.blogspot.com/feeds/2772501482255968028/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://sarahestrong.blogspot.com/2009/05/beginnings.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4231653014599940335/posts/default/2772501482255968028'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4231653014599940335/posts/default/2772501482255968028'/><link rel='alternate' type='text/html' href='http://sarahestrong.blogspot.com/2009/05/beginnings.html' title='Beginnings'/><author><name>Sarah Strong</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry></feed>
