September 2016 – the final tally

Well, I will definitely remember September 2016 for a number of reasons, but one is certainly the amount of travel involved. The story will be much easier if I break it down into numbers:

Morning view, between sea and sky

Nights away from home: 17

Total distance traveled: 31,444 km (FYI, this is almost the circumference of the Earth)

  • By airplane: 30,711 km across 15 flights
  • By car: 448 km
  • On foot: 193 km (yep, I did walk that distance)
  • By train: 92 km


  • 8 airports
  • 5 cities
  • 3 countries
  • 2 continents

And all in the span of 30 days. Well, #neveragain would probably be the most appropriate hashtag for this, but for me it was definitely an experience, and one I plan not to relive in the future… 🙂

Oh, and in case you are wondering what would be a fitting end to this travel-esque September? Why, the 5km Thessaloniki Night Half-marathon run, of course… 🙂 And yes, I did manage to keep my time under the 30-minute mark, thank you very much. #lifeontherun

Teaching my first Software Carpentry Workshop

Group Photo at the finish of the AUTH SWC workshop (Oct 2016)

“My first Software Carpentry workshop”. Sounds a bit like a first-grader’s essay. But in all honesty, it was a brand new experience for me. Sure, I have been teaching University lectures and seminars for quite some time, but doing a live-coding session was an experience of its own.

All things considered, the workshop was a success (evident also in the group photo here); participants were engaged throughout the two-day workshop, there was continuous feedback and discussions on tough points in the lesson, so everyone left quite happy and (hopefully) more knowledgeable than before.

For me, two things have become even clearer:

  1. Better organization of the workshop in terms of the participants’ level of experience. Sure, very few actually complained about the pace, but I have the impression that we could do a better job if we focused on a single level, e.g. beginners only, or intermediate.
  2. Two days may be convenient in terms of the time each person needs to allocate (given most people’s rather busy schedules), but it’s a very short time to cover R, the Shell, and Git. So maybe next time (because there will definitely be a next time) we will aim for three days.

Oh, and a final piece of advice: if you intend to have food or catering at a SWC workshop, ordering pizza may be the best solution… 🙂

Training for NGS data analysis using Chipster

The story is rather simple. Yesterday, my lab, together with the Institute of Applied Biosciences, co-organized a training workshop for NGS data analysis. For anyone even remotely engaged with NGS data, the biggest problem is usually the computational complexity. In simple words, analyzing tons of data takes a very, very long time, which means that the analysis is essentially performed by people who are familiar with the tools (and their command-line interfaces) that can be used on high-end computational systems.
However, this workshop went slightly off the beaten path by (mostly) skipping the command-line interface and going directly to the graphical interface of Chipster, developed, maintained and kindly provided by CSC. This “deviation” allowed the participants, who had mainly wet-lab research backgrounds, to easily follow the established workflows and pipelines used in NGS data analysis. Moreover, instead of using local computational resources, we launched several Chipster servers through the EGI Federated Cloud. So in one training session, the participants were exposed both to the computational capabilities and infrastructure of EGI, as well as to the pipelines used in NGS data analysis. All in all, a very dense 8-hour workshop!
The level of the participants’ experience was also quite diverse, ranging from undergraduate students to faculty members and staff scientists. Despite that though, the workshop was very engaging to all members, a fact clearly seen in the happy faces all around, even when the workshop extended a full hour beyond the expected wrap-up time!
So, the take-home message: there is clearly a need (some might consider it a desperate one) for training events in bioinformatics, and especially in Big Data studies such as NGS data analysis. However, such events should not necessarily focus on the tech-savvy user; or, at the very least, they should actively encourage non-expert researchers to attend by providing (a) user-friendly interfaces, (b) hands-on exercises that feel close to the actual work of the participants, and (c) the time necessary for everyone to keep their own pace.
Finally, I would be remiss if I didn’t sufficiently thank the two people who really supported this workshop: Diego Scardaci from EGI and Kimmo Mattila from CSC, whom I constantly pestered with questions and issues over the past few weeks, and who always had the time and patience to lend me their experience.
Hopefully, there will be follow-up and more specialized workshops. In the meantime, if you are interested, the next one will take place at the EGI Community Forum in Bari. So, hope to see you there!

Integrating datasets for bioinformatics

Well, it seems that I have yet another story regarding EGI. Actually, make that two: one is a new article in the EGI Inspire Newsletter, written in collaboration with Rafael Jimenez, about a joint project between EGI and ELIXIR. The second is the joint project itself.
A collaboration between ELIXIR and EGI is by itself great news. Personally, it means that there will be greater opportunities to find (and probably develop) bioinformatics tools that also utilize and work with the computational infrastructure of EGI. And with little to no expertise required from the end user; it’s no secret that the average wet-lab researcher is a bit hesitant when it comes to the “little black window”, a.k.a. the terminal. 🙂
The joint project I mentioned earlier is one that I am proud to coordinate. It is an EGI Virtual Team project on Integrating Life Science Reference Datasets. Yes, I know it’s a mouthful, but it’s really quite simple: instead of having to constantly copy reference datasets (e.g. NR/NT, UniProt, Bowtie index files, etc.) to several computational nodes, leave it to the infrastructure to do it for you.
As a project it just started, but hopefully we’ll have some interesting results in the next 9 months. I’ll keep you posted!

Future opportunities and trends for e-infrastructures and life sciences

Working with friends, beyond being a pleasure, usually bears fruit. Case in point: the article published today in the EGI Inspire Newsletter, written with the help of my close friend and colleague Afonso Duarte.
Beyond being a nice study on the trend of Life Sciences working with e-infrastructures (and Grid/Cloud computing specifically), this article is also an announcement of the Workshop we are organizing in the upcoming EGI Conference in Helsinki. Hope to see you there too! 😉

File Upload using Perl and DANCER

One of the most common functions in a web application is file upload. The following is working code that can be used to this end. The only prerequisites are Perl (obviously) and the DANCER framework. Right! Here goes:
post '/upload/:file' => sub {
  my $upload_dir = "/home/fpsom/myApp/UPLOADS";
  my $filename = params->{file};
  my $uploadedFile = upload('file_input_foo');
  debug "My Log 1: " . params->{file};
  debug "My Log 2: " . ref($uploadedFile);
  # Save the uploaded content under the target directory
  $uploadedFile->copy_to("$upload_dir/$filename");
  return "Uploaded $filename\n";
};
This should be copied into the corresponding file under myApp/lib/ (check for it in your app).
In order to check the functionality of this code, I’ve used cURL as follows, where testUploadFile is an existing file:
curl -i -F file_input_foo=@testUploadFile http://localhost:3000/upload/testUploadFile
The output on the “development dance floor” should be as follows:
[10269]  core @0.000119> request: POST /upload/testUploadFile from in /usr/local/share/perl/5.14.2/Dancer/ l. 56
[10269]  core @0.000413> [hit #4]Trying to match 'POST /upload/testUploadFile' against /^\/upload\/([^\/]+)$/ (generated from '/upload/:file') in /usr/local/share/perl/5.14.2/Dancer/ l. 84
[10269]  core @0.000530> [hit #4]  --> got 1 in /usr/local/share/perl/5.14.2/Dancer/ l. 102
[10269]  core @0.000627> [hit #4]  --> named tokens are: file in /usr/local/share/perl/5.14.2/Dancer/ l. 130
[10269] debug @0.001051> [hit #4]My Log 1: testUploadFile in /home/fpsom/myApp/lib/ l. 21
[10269] debug @0.001145> [hit #4]My Log 2: Dancer::Request::Upload in /home/fpsom/myApp/lib/ l. 22
[10269]  core @0.001446> [hit #4]response: 200 in /usr/local/share/perl/5.14.2/Dancer/ l. 179
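As a side note, the “Trying to match” log line above shows the regular expression that DANCER generates from the route pattern. That matching step can be sketched in plain Perl (no DANCER required; the path value is the one from the cURL test above):

```perl
# How a route pattern like '/upload/:file' is matched: the pattern is
# compiled to a regex (see the log line above) and the named token ':file'
# becomes a capture group.
my $path = '/upload/testUploadFile';
if ( $path =~ m{^/upload/([^/]+)$} ) {
    my %params = ( file => $1 );          # named token 'file'
    print "file => $params{file}\n";      # prints: file => testUploadFile
}
```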
And that is that! Hope it helps.

Perl DANCER framework

After quite some time, I had to go back to writing web services. I can’t say I was really looking forward to it, but don’t get me wrong; I love writing code and I feel quite confident at it. On the other hand, my memories of writing web services aren’t the most comforting ones. Admittedly, the last time was in 2008 and I was writing in Java, so I had to juggle a lot of things to get everything working just right, but still…
Anyway, my project now is in Perl, so I had to look around a little for any existing frameworks that might make my life easier, which is when I came across the DANCER framework. And it was a brand new day…
That was two days ago. In this time I’ve been able to set up almost everything, writing only a fraction of the code I used to. Now I am working on the particulars of the application behind the web service. The setup of the service is straightforward; the DANCER dev team actually provides a tool to help along the way. But (as in every transition between languages) there are several bumps in the road after that. So, in order to help anyone interested, I’ll share any “recipes” that I come up with along the way.
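For reference, the setup tool is the `dancer` script that ships with the framework (`dancer -a myApp` generates a ready-to-run application skeleton). A minimal route in the generated app looks something like this; a sketch, assuming Dancer 1 is installed from CPAN (the route and greeting are purely illustrative):

```perl
#!/usr/bin/env perl
use Dancer;

# A minimal route: ':name' becomes a named token available via params
get '/hello/:name' => sub {
    return "Hello, " . params->{name} . "!";
};

dance;   # start the built-in development server (port 3000 by default)
```

Run it with `perl bin/app.pl` from inside the generated app and point your browser at http://localhost:3000/hello/world.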
Good coding to all!