Week 35 in review
Author: James Wilson (@jwilson92)
Now that the summer is (almost) over, this week has predominantly been spent planning for the remainder of the quarter and the beginning of 2022, with a big focus on PDFx and ADDRESSABLE and on empowering their users.
For product context, see PDFx and ADDRESSABLE.
What did we do this week
PDFx
Improved table extraction
As well as improving the algorithms behind table extraction, we have started work on OCR capabilities so that PDFx users can extract tables that have been added to a document as an image or are, ultimately, poorly structured.
The OCR models are still relatively untrained, so we expect OCR to be released as part of our upcoming Canary build.
To empower users, we are also working on an updated design for showing the data that has been extracted. The new table aims to be clearer and to allow Excel-like functionality for that crisp editing feeling.
Travelling whilst staying at home
Whilst a lot of us are now able to jump on a plane and travel to designated countries, PDFx is having its own travel ban lifted as it explores German Exposé documents. This is part of a proof of concept (PoC) with a number of potential clients in the region who need software to support their data extraction.
An update on the Microsoft Connector
As announced in our inaugural Week in Review, we have started seeing our first clients use the connector to forward emails and have them processed automatically, saving further time and allowing them to make decisions faster.
ADDRESSABLE
Self-service data processing
As part of our commitment to ensuring that users can process data as quickly as possible, we have released the first version of our self-service application for data processing, built on the ADDRESSABLE backbone.
Map Reduce
In last week's post we mentioned that the final work had been completed on our map reduce tasks. With that done, our data science team have started using D3.js to visualise the resulting data. We hope to share more about our findings in next week's post.
In other news
Price per square metre data released
Our latest data release went out on Wednesday, with over 17 million rows of real estate data, completely free of charge.
We are extremely proud of this release, and we love hearing about the different ways users are putting the data to work.
If you would like to read more about PPSM, please click here.
Does size matter?
This week we released "Does size matter? Part II: All value matters relative to size", which looked at the impact of the COVID lockdown on Residential and Commercial transactions.
Interested in finding out more? Contact us →