Resources‎ > ‎Python Resources‎ > ‎

Integrating Python with OpenOffice.org

What if we can just move the data into spreadsheets?
The spreadsheet is the bread and butter of financial wizards. If you could just get the data into a spreadsheet, you could be empowered to explore the data in powerful ways. Another angle is that if you can move your data to a spreadsheet, you do not have to write a GUI. It is already written for you. Why reinvent the wheel?
If only it weren't so hard to get started! The problem is that OpenOffice.org is very powerful and all its features are exposed to the developer. Maybe with OpenOffice.org 3, extending OpenOffice.org of simple, common tasks will become even easier. For now, we will accept the 'getting started' code in good faith.
Getting Started
The first step is that OpenOffice.org must be started with the option to listen to requests from other programs. So, on the command line, execute the following:
    ooffice -invisible '-accept=socket,host=localhost,port=2002;urp;'
Now, we want to open a spreadsheet. We will need to connect to the OpenOffice.org we started and send a request to open a spreadsheet. The 'invisible' option in starting OpenOffice.org and the 'Hidden' option in the following code snippet ensure that the work is done in the background (neither is essential). Insert the following code in a file called open_office.py:

import uno
from com.sun.star.beans import PropertyValue
def oo_calc(uri='private:factory/scalc'): #if no uri is passed, open a blank calc document
    localContext = uno.getComponentContext()
    resolver = localContext.ServiceManage.CreateInstanceWithContext('com.sun.star.bridge.UnoUrlResolver',localContext)
    ctx = resolver.resolve('uno:socket,host=localhost,port=2002;urp;StarOffice.ComponentContext')
    p = PropertyValue()
    p.Name = 'Hidden'
    p.Value = True
    properties = (p,)
    desktop = ctx.ServiceManager.createInstance('com.sun.star.frame.Desktop')
    return desktop.loadComponentFromURL(uri,'_blank',0,properties)

    Since everything is invisible, how du you know that you have succeeded? Well, change the 'p.Value' to 'False' while you are testing. Now, start the python interpreter and run the following commands:

    >> from open_office import *
    >> calc = oo_calc()

    If the OpenOffice.org program is running and all is well, you will have a new Calc document.

Working with the spreadsheet
A new Calc document will consist of three sheets, by default. You will load the data of friends in the first sheet an the 'loaned items' data in the second sheet. Before doing that, add the functions to select a sheet. So, add the following code in the open_office.py file.

    def get_sheet(index, workbook):
        """Return the desired sheet. If the index is wrong, return the 0th sheet"""
        sheets = workbook.getSheets()
        if index >= sheets.getCount():
            print "Warning: Index out of range. Returning Ist Sheet"
            index = 0
        return sheets.getByIndex(index)

    You can now test this code. Start the OpenOffice with options as above. Now, start the python interpreter and give the commands:
   
    >> from open_office import *
    >> calc = oo_cale()
    >> sheet = get_sheet(0,calc)
    >> sheet.setName('Friends')
    >>dir(sheet)

    The sheet would now have a new name. The dir command will give you information about all teh attributes and methods available for the 'sheet' object As we had mentioned above, there is a lot of power available to the programmer.
    You are interested in changing the values of the cells in the sheet. The getCellByPosition method is promising. Google helps in finding the details of the parameters if you search form 'pyuno getCellByPostion'.

    >> cell = sheet.getCellByPosition(1,2)
    >> dir(cell)
   
    Now, you can see the various options available with the cell object. The setString and setValue methods look promising.
   
    >> cell.setString('The Price')
    >> cell2 = sheet.getCellByPosition(1,3)
    >> cell2.setValue(99.95)

    You will discover that the names are meaningful -- one method is for storing text and the other for numeric values. Since Python uses dynamic typing, you can now add a convenience function in open_office.py to store data in a cell, depending on the type of data.

    def set_cell(sheet, row, col, data):
        xCell = sheet.getCellBy Position(col,row)
        if type(data is (type(str()),type(unicode())): #String/Unicode type?
            xCell.setString(data)
        else:
            #assume a numeric value
            try:
                xCell.setValue(data)
            except: #Ignore cells with invalid data
                print 'Invalid data',data
   
    Currently, you can assume that the data is either numeric or string. OpenOffice.org works with Unicode strings. So, you will need to handles Unicode Strings as well. You will also need to handle exceptions because some fields may have no values. You can test your code as before:

    >> from open_office import *
    >> calc = oo_calc()
    >> sheet = get_sheet(0,calc)
    >> sheet.setName('Active')
    >> set_cell(sheet,0,0,'This is it')
    >> set_cell(sheet,1,1,"99.99")
    >> set_cell(sheet,1,2,99.99)

    You will want to save the work. So, you need to add a function to save the file and to allow overwriting an existing file.

    def save_document(calc,filename,overwrite):
        p = PropertyValue()
        p.Name = 'Overwrite'
        p.Value = overwrite
        properties = (p,)
        calc.storeAsURL('file://'+filename,properties)

    You can test the method loading data into the spreadsheet as above, saving the file in your home folder and closing (disposing off) the document.

    >> save_document(calc,'~/workbook.ods',True)
    >> calc.dispose()

    If your repeat this test, the file will be replaced.

Database to the spreadsheet
You should reuse the loans_v2.py created earlier. The minimal code needed for this exercise is given below:

    import shelve
   
    class friends:
        def __init__(self,name,phone=None,email=None):
            self.name = name
            self.phone = phone
            self.email = email
   
    class item:
        def __init__(self, item_name, borrower = None):
            self.item_name = item_name
            self.borrower = borrower
       
    class loans:
        def __init__(self, friends_db, items_db):
            self.friends = shelve.open(friends_db)
            self.items = shelve.open(items_db)
        def close(self):
            self.friends.close()
            self.items.close()

    You will make use of the two files and write your application code in db2oo.py. The friends.db and items.db should contain the data as discussed in an earlier article. If they do not exist, your spreadsheet will contain only column headings and you can add the data. You will need to create the loans object and the Calc document:
   
    >> from open_office import *
    >> from loans_v2 import *
    >> my_loans = loans('friends.db','items.db')
    >> calc = oo_calc()

    You can add a utility function to insert column values into a row:

    def add_row(sheet, row, row_data):
        col = 0
        for data in row_data:
            set_cell(sheet, row, col, data)
            col += 1

    Now, you can insert the data from the friends database into the first sheet:

    >> sheet = get_sheet(0,calc)
    >> sheet.setName('Friends')
    >> add_row(sheet, 0,['key','name','phone','email']) #column headings
    >> row = 1
    >> for key in my_loans.friends:
        f = my_loans.friends(key)
        add_row(sheet,row, [key,f.name, f.phone,f.email]) # add the key and attributes
    >> row += 1

    It would be nice to have column widths of a proper size. So, let us add a utility function, which takes the sheet and the number of columns as parameters, and call it:
   
    def optimise_column_widths(sheet,n):
        columns = sheet.getColumns()
        for col in range(n):
            column = columns.getByIndex(col)
            column.setPropertyValue('OptimalWidth', True)
    optimise_column_widths(sheet,4)

Now, we repeat similar code for the items loaned.
   
    sheet = get_sheet(1,calc)
    sheet.setName('Items')
    add_row(sheet, 0,('code','Item Name', 'Borrower'))
    row = 1
    for key in my_loans.items:
        it = my_loans.items(key)
        add_row(sheet, row,[key, it.item_name,it.borrower])
        row += 1
    optimise_column_widths(sheet,3)

Finally, we save the file and close the databases.

    save_document(calc, '~/workbook.ods',True)
    calc.dispose()
    my_loans.close()

    Now, you can modify and change the data in the spreadsheet by using OpenOffice.org and when satisfied, reverse the process.

From spreadsheet to the databases
You can start with making changes to the open_office.py file. You will need a function to get the data from a cell.

    def get_cess9sheet,row,col):
        EMPTY= uno.Enum("com.sun.star.table.CellContenType", "EMPTY")
        TEXT = uno.Enum("com.sun.star.table.CellContentType","TEXT")
        FORMULA = uno.Enum("com.sun.star.table.CellContentType","FORMULA")
        VALUE = uno.Enum("com.sun.star.table.CellContentType","VALUE")
        xcell = sheet.getCellByPosition(col,row)
        datatype = xcell.getType()
        if datatype = EMPTY:
            return None
        elif datatype == VALUE:
            return xcell.getValue()
        else:
            return xcell.getString()

    The type property of a cell tells you about the contents of a cell and then you can extract the data using the appropriate method. Your function will not handle the formulas properly.
    The interpretation of numeric data is controlled by formatting information -- for example, whether the content of a cell is a date. You need not handle such complexity at this stage.
    Now, you can write the following code in oo2db.py:

    from open_office import *
    from loans_v2 import *
   
    my_loans = loans('friends.db','items.db')
    calc = oo_clac(uri='file://~/workbook.ods')
    sheet = get_sheet(0,calc)
    row = 1
    while True:
        key = get_cell(sheet,row,0)
        if key == None:
            break
        my_loans.friends[str(key)] = friend(get_cell(sheet,row,1),get_cell(sheet,row,2),get_cell(sheet,row,3)
        row += 1
        sheet = get_sheet(1,calc)
        row = 1
        while True:
            key = get_calc(sheet, row,0)
            if key == None:
                break
            my_loans.item[str(key)] = item(get_cell(sheet,row,1),get_cell(sheet,row,2))
            row += 1
        calc.dispose()
        my_loans.close()

    This code will read rows on a sheet form an existing document until it finds an empty key column. The key has to be a string so the Unicode value returned from the spreadsheet is forced to a string.
    You can verify if there is a bug in this code. A row from a database cannot be deleted. If we change the key, and additional row is inserted in the database. If you delete a row from the spreadsheet, the original row still remains in the database.
    The program should also ensure that the data on the second sheet in s consistent with the data on the first sheet present. You can practice making the program better.
    You can learn how to create more complies programmes with OpenOffice.org by using the API documentation at api.openoffice.org along with the PyUNO bridge documentation at uk.openoffice.org/python/python-bridge.html. However, you may find it easier to get a working solution by using Google search with keyword "pyuno" and a suitable keyword from the result of 'dir' on a pyUno object, as mentioned earlier.
    Finally, let's face it: this solution is clumsy. You do not expect a user to strt OpenOffice.org with a shell in network mode. So run the db2oo.py application. After that, modify the spreadsheet document just created. And finally, move the data back into the database using the oo2bd.py application.
    What you need is an application that will manage the flow of the various applications. Python is a great 'glue language'. So, next time, we will look at how to use Python to communicate with the OS and create an integrated workflow.
   
   
Comments