Presented by Alejandro Cadavid to WinLibre
Alejandro Cadavid López
Medellín, Antioquia, Colombia.
Timezone: GMT -5
Usual Nickname: acadavid
Mail/Gtalk: acadavid@gmail.com
Skype: alejandro.cadavid
MSN: ale_cad_lop@hotmail.com
Synopsis
WinLibre Package Manager is an application to serve Open Source software to Windows users. The basic idea is to allow Windows users to get a catalog of the available applications and download and install them if they want to. This proposal shows my ideas on the repository server that will hold the package descriptions.
Benefits to the WinLibre Community
Windows users often don't even know about the great open source packages available, so an application like WinLibre will make a great difference to open source software on Windows. It's no secret that most open source software for Windows is somewhat hidden in IRC channels, discussion lists and geek websites, and that some of those programs are not easy to install. With WinLibre, there is a good chance that open source software will become more reachable for ordinary users.
Deliverables
* Package parser. Depending on the selected package file
standard, a parser will be created (or a library reused [1] [2] [3]).
This will allow us to get global and individual information about the
available packages. In addition to the parser, a way to store those
files will be built (database, direct upload to the server, mirroring,
etc).
* Repository server with GET method working. The package
manager will be able to download a global description of all the files
available in the repository so that they can then be downloaded.
* Repository with POST method working. The package creator, or
developers through a web page (a simplified version of the release
website), will be able to upload a package descriptor file (XML, JSON,
YAML, XPKG or HTML, for example). In order to do this, the developer
must be identified by the system (OpenID account, Google Account or our
own authentication system), so this deliverable will include the
identification system as well.
* Repository with DELETE, PUT (update) and any other "auxiliary"
methods needed. This will almost complete the whole system, allowing
users to fetch information about the packages, and allowing developers
to post and modify each package.
* Repository mirroring. As there will be backup servers for the
repository, an application to synchronize with these mirrors should be
built. It could be created as a Python script and added to a cron
process on the backup servers. The application will basically compare
the versions of the files on the live server and the backup ones, and
will then download all the outdated packages to the backup server. This
could be done daily or twice a day, at low-traffic hours, to keep the
servers from being overloaded. Another way to do this is to let the
live server make HTTP requests to the backup servers in order to keep
files updated, but this could overload the live server.
* Web Portal. After everything in the backend is implemented, a
website will be created so users can browse through package categories,
download package descriptors, or download the software itself. It will
also allow (authenticated) developers to create new packages or update
existing ones. Users will be able to rank, comment on and download
packages. Other features could be added along the way.
Project Details
My development process
To get my ideas clear, I first write them on paper and try to
solve the problem conceptually. I research on my own and ask for
recommendations on the issue if necessary. I write a possible solution
on paper and then discuss it to get feedback. When I'm sure of what is
going to be done, I write some pseudo-code and, if no issues arise, I
go into implementation. I'll keep the whole process documented so that
there is useful information after the project. I ask for code review if
necessary, and then commit. I like documenting a lot; in fact, I prefer
writing the documentation for each method first, to get a clearer idea
of what it is going to do, and then I write its code. I can work
comfortably with SVN and Git, but I have no problem learning to use any
other version control system. I would recommend Review Board to keep
track of the code and a wiki to keep all the knowledge gathered from
the development process. If I find another tool of interest, I'll
recommend it. I'm willing to learn any other tool that you consider a
good idea to use along the project. I like committing often, at least
once daily, and of course keeping the weekly delivery promise, to keep
myself motivated and keep worries away from the project.
Package Parser
A data model is already defined. With this model, a package
parser will be easy to build. It will depend on what kind of XML file
we will be using. The parser could be used in each part of the project
(repository server, package manager and package creator). The parser
will give us methods to access each part of the data model from the
package descriptor.
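A minimal sketch of what such a parser could look like, assuming an XML
descriptor. Only the <package> and <version> elements appear in this
proposal; the <name> and <category> elements and the PackageDescriptor
class are hypothetical names used for illustration, not the agreed data
model.

```python
import xml.etree.ElementTree as ET

# Hypothetical descriptor: <name> and <category> are assumed fields.
SAMPLE = """<package>
  <name>ExampleApp</name>
  <category>editors</category>
  <version>2.5</version>
</package>"""


class PackageDescriptor:
    """Read-only view over one parsed package descriptor."""

    def __init__(self, xml_text):
        self._root = ET.fromstring(xml_text)
        if self._root.tag != "package":
            raise ValueError("expected a <package> root element")

    def get(self, field, default=None):
        """Return the text of a direct child element, or default."""
        for child in self._root:
            if child.tag == field and child.text is not None:
                return child.text.strip()
        return default

    def as_dict(self):
        """Return all direct child elements as a field -> text mapping."""
        return {child.tag: (child.text or "").strip() for child in self._root}


descriptor = PackageDescriptor(SAMPLE)
print(descriptor.get("version"))  # 2.5
```

The same class could be shared by the repository server, the package
manager and the package creator, so the three agree on how a descriptor
is read.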
Data handling on repository server.
The REST architecture [4] [5] will be used throughout the project. REST
uses the HTTP protocol methods explicitly, so we can GET resources,
POST them, PUT (modify) them or DELETE them. Each package will have its
own URI. Identifiers for packages will be defined depending on the
application standards used. A general idea of this would be something
like this:
http://repositoryserver/packages/category/appidentifier
This assumes that packages will be classified under certain categories,
for example. Methods like GET can be executed on that URI to retrieve
the whole package descriptor, for example, or just parts of the data
model. Django can handle HTTP requests, so there is no problem with
this [6].
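As a rough illustration of the URI scheme, the sketch below builds and
parses package URIs using only the Python standard library. The host
name is the placeholder used in this proposal, and the function names
package_uri and parse_package_uri are hypothetical.

```python
from urllib.parse import quote, unquote, urlsplit

BASE_URL = "http://repositoryserver"  # placeholder host from the proposal


def package_uri(category, app_id):
    """Build the URI of one package: /packages/<category>/<app_id>."""
    return "{}/packages/{}/{}".format(
        BASE_URL, quote(category, safe=""), quote(app_id, safe="")
    )


def parse_package_uri(uri):
    """Return (category, app_id) for a package URI, or raise ValueError."""
    parts = urlsplit(uri).path.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "packages":
        raise ValueError("not a package URI: {}".format(uri))
    return unquote(parts[1]), unquote(parts[2])


uri = package_uri("editors", "exampleapp")
print(uri)  # http://repositoryserver/packages/editors/exampleapp
print(parse_package_uri(uri))  # ('editors', 'exampleapp')
```

Percent-encoding the two path segments keeps the scheme unambiguous
even if a category or identifier contains spaces or slashes.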
GET Method
There are two different situations here.
- Global descriptions
A compressed XML file could be generated by the server each time a
package is added, modified or deleted. This file will have a basic
description of each application available in the repository (there is
already a data model for this file too). It will include the URI of
each application, allowing the package manager to execute HTTP methods
on the repository for a selected application. Regenerating the XML file
on every change could be inefficient, so a more efficient method or a
better way to do the global update could be investigated.
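The generation of the global description could be sketched as follows,
assuming gzip compression and a flat <packages> document. The element
names, the "uri" attribute and the function build_global_descriptor are
assumptions for illustration; the real layout would follow the existing
data model.

```python
import gzip
import xml.etree.ElementTree as ET


def build_global_descriptor(packages):
    """Return a gzip-compressed XML summary of all packages.

    packages: iterable of dicts with 'name', 'version' and 'uri' keys
    (assumed fields, not the agreed data model).
    """
    root = ET.Element("packages")
    for pkg in packages:
        node = ET.SubElement(root, "package", uri=pkg["uri"])
        ET.SubElement(node, "name").text = pkg["name"]
        ET.SubElement(node, "version").text = pkg["version"]
    return gzip.compress(ET.tostring(root, encoding="utf-8"))


catalog = build_global_descriptor([
    {"name": "ExampleApp", "version": "2.5",
     "uri": "http://repositoryserver/packages/editors/exampleapp"},
])

# The package manager side: decompress and read the entries back.
for entry in ET.fromstring(gzip.decompress(catalog)):
    print(entry.get("uri"), entry.findtext("version"))
```

The server would call such a function whenever a package is added,
modified or deleted, and serve the resulting bytes as a static file.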
- Application descriptor
The package manager will request information about a certain package
using the GET method. The repository will answer with the package
descriptor of the application. A GET request could even ask for a
single piece of information about the package, such as its version, so
there is no need to download the full XML file. The URI of each package
will allow us to access these resources in an easy and standardized
way.
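One possible shape for this GET handling is sketched below,
independently of Django. The in-memory STORE, the handle_get function
and the optional trailing field segment in the path are all assumptions
made for illustration; the proposal does not fix how a single field is
addressed.

```python
import xml.etree.ElementTree as ET

# In-memory stand-in for the repository storage (hypothetical).
STORE = {
    ("editors", "exampleapp"):
        "<package><name>ExampleApp</name><version>2.5</version></package>",
}


def handle_get(path):
    """Return (status, body) for /packages/<category>/<app>[/<field>]."""
    parts = path.strip("/").split("/")
    if len(parts) not in (3, 4) or parts[0] != "packages":
        return 404, ""
    descriptor = STORE.get((parts[1], parts[2]))
    if descriptor is None:
        return 404, ""
    if len(parts) == 3:
        return 200, descriptor  # full package descriptor
    # A fourth segment asks for one field of the descriptor only.
    for child in ET.fromstring(descriptor):
        if child.tag == parts[3]:
            return 200, child.text or ""
    return 404, ""


print(handle_get("/packages/editors/exampleapp/version"))  # (200, '2.5')
```

With this layout the package manager can check whether an update exists
by fetching only the version, and download the full descriptor only
when it needs to.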
POST, PUT and DELETE Method.
All these methods are standard in the HTTP protocol. The way to use
them should be defined depending on the kind of application descriptor
that will be used. For example, we could modify a package using a
request like this:
PUT /application/appidentifier HTTP/1.1
Host: repositoryserver
Content-Type: application/xml

<package>
<version>2.5</version>
</package>
DELETE requests would delete the package descriptor from the
repository. As there should be some way to authenticate users, we could
use django-OpenID [7] and Google APIs [8], or create our own library to
do this.
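A sketch of how the server side of PUT and DELETE could behave,
assuming the PUT body is treated as a partial update, as in the request
shown above (a stricter REST design would replace the whole resource
instead). The functions apply_put and apply_delete and the dictionary
used as storage are hypothetical, and authentication is left out.

```python
import xml.etree.ElementTree as ET


def apply_put(stored_xml, update_xml):
    """Merge the child elements of update_xml into stored_xml."""
    stored = ET.fromstring(stored_xml)
    update = ET.fromstring(update_xml)
    for new_child in update:
        old_child = stored.find(new_child.tag)
        if old_child is None:
            stored.append(new_child)  # field did not exist yet
        else:
            old_child.text = new_child.text  # overwrite existing field
    return ET.tostring(stored, encoding="unicode")


def apply_delete(store, app_id):
    """Remove a descriptor from the store; return True if it existed."""
    return store.pop(app_id, None) is not None


store = {"exampleapp":
         "<package><name>ExampleApp</name><version>2.4</version></package>"}
store["exampleapp"] = apply_put(
    store["exampleapp"], "<package><version>2.5</version></package>")
print(store["exampleapp"])
print(apply_delete(store, "exampleapp"))  # True
```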
Backup server applications:
Depending on how files are going to be stored on the live
server, the backup server will check the versions on the live server
and download the new and updated ones. If the live server goes down,
the mirror server could check its state and take its place if
necessary. This is complex to do, so it will require further research.
But at least the mirror servers could work as backup servers for the
data, kept up to date in case of an emergency.
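The version check on the backup server could look roughly like this. It
is a sketch that assumes purely numeric dotted versions such as "2.5"
and that both servers can produce a mapping of package identifier to
version; parse_version and find_outdated are hypothetical names.

```python
def parse_version(text):
    """Turn '1.10' into (1, 10) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def find_outdated(live, mirror):
    """Return the ids of packages that are missing or older on the mirror.

    live, mirror: dicts mapping package id -> version string.
    """
    outdated = []
    for app_id, live_version in sorted(live.items()):
        mirror_version = mirror.get(app_id)
        if (mirror_version is None
                or parse_version(mirror_version) < parse_version(live_version)):
            outdated.append(app_id)
    return outdated


live = {"a": "2.5", "b": "1.10", "c": "1.0"}
mirror = {"a": "2.5", "b": "1.9"}
print(find_outdated(live, mirror))  # ['b', 'c']
```

A cron job on each backup server could run this comparison at
low-traffic hours and download only the packages it returns.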
Web Portal
After every HTTP request is handled by the repository, the web portal will be created. The basic features will be:
* Application category browsing
* Application search (By name, description, etc)
* Application description (Name, version, download URL, etc)
* Application ranking
* Comments on Applications
* Application Screenshots upload.
Many other features could be added; it's just a matter of
deciding which will benefit the users. We could ask for user feedback
on this.
Project Schedule
This proposal assumes that I will invest the entire GSoC period in
it. I will start working on the project around April 20th, if accepted.
From April 20th to May 23rd I'll be working with the WinLibre team (and
the other accepted students) to get concrete ideas on the file
specifications and to improve my ideas on the repository server. I'll
take some time to relearn Python and to learn Django.
From May 23rd to July 13th I will be working 8 to 10 hours a day. I
expect to be working at least 40 hours per week on the project. From
July 13th to August 24th I will be working 6 hours a day. That's
because my university starts again on July 13th, but this is not a big
issue because the first month of classes is not that hard, so I'll have
enough time to work on the project, and I expect to have a big part of
it done by that date. I will be very pleased to keep working on the
project after GSoC is over. It's a very exciting and quite challenging
project.
Week 1: XML parser building
Week 2: Parser testing with pre-built packages and bug fixing.
Week 3: Descriptor file storage method
Week 4: Global descriptor generator.
Week 5: Repository with GET Method.
Week 6: Authentication Methods development.
Week 7: Repository with POST Method.
Week 8: Repository with PUT and DELETE methods.
Week 9: Backup servers application
Week 10: Testing repository with Package manager and package creator.
Week 10: WebPortal development (Category browsing, application description)
Week 11: WebPortal development (Searching, ranking, comments and other features)
Week 12: Testing week and bug fixing.
Who am I?
I was born in Medellín, Colombia, on March 26th 1989, so right
now I'm 20 years old. I study Systems Engineering (a curriculum similar
to Computer Science) at Universidad EAFIT. This is my first time at
GSoC, and my first time working on an Open Source project as well. I'm
interested in programming, mathematics and music. I'm part of a group
at my university where we develop techniques to solve programming
problems (ACM, SPOJ.pl, TopCoder, etc). I'm a very open-minded person.
I like sharing ideas and discussing them. I usually work on my own, and
I try to find my own ways to solve issues before asking. If something
gets too hard or takes more than the planned time, I'll ask for help.
Still, I like reporting my progress to get feedback and to feel
motivated. That's why I intend to make a weekly delivery: it keeps me
focused and self-motivated to keep working.
My Experience:
I mainly use Linux as my operating system. I use Windows as
well. My experience with programming languages is not so wide: I have
used Java, C++ and Python in my university projects. I made a Python
project called Rigo [9] with a classmate at my university (it was
basically a code parser and executor, something like Karel but
lighter). I'm showing this code just as proof that I've coded in
Python, but it's not actually the best code to show. My coding
practices there are really awful. That's not the kind of code I'd like
to develop for WinLibre.
I don't have real experience with Django, but I have done some
development with Ruby on Rails, so I think that getting the idea of
Django won't be that hard. In any case, I'll do my homework on learning
it, and on relearning Python as well. I will research everything that
is needed for the project.
[1] http://pyyaml.org/wiki/PyYAML
[2] http://pypi.python.org/pypi/python-json/3.4
[3] http://pyxml.sourceforge.net
[4] http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm
[5] http://en.wikipedia.org/wiki/Representational_State_Transfer
[6] http://docs.djangoproject.com/en/dev/ref/request-response/?from=olddocs
[7] http://code.google.com/p/django-openid/
[8] http://code.google.com/apis/accounts/docs/AuthForWebApps.html
[9] http://acadavid.nfshost.com/uploads/rigo.tar.gz