Archive for category Data Transformation

Data Warehousing Goodness with FDO Toolbox (sorta)

Ok, not real data warehousing (no star schema here) but anyone who has dealt with performance issues in MapGuide due to on-the-fly joins across heterogeneous data sources knows the value of loading read-only data sets into static SDF (or SQLite!) files for rapid display.

With release 0.50 of FDO Toolbox, Jackie has done it again by allowing users to quickly take data from both spatial (FDO) data sources, and non-spatial (OLEDB) data sources, join them together, and write out performance-optimized files. This is a huge boon for folks that need to do this and can’t afford best-of-breed proprietary tools like Safe Software’s FME.

Jackie has really impressed me with the rapid development of this tool as well as his focus on providing value in three distinct areas. First, FDO Toolbox has a great GUI for FDO data transfer and administration. Second, the command line capabilities allow you to set up scheduled translations to keep your SDF files in sync with your corporate data stores. Finally, FDO Toolbox has a minimal profile and can easily be used by install scripts that need to load data, register FDO providers, and other tasks during an automated application installation.

What’s next on the horizon? Jackie’s recent post on an FDO plug-in for SharpMap provides a hint… spatial data inspection coming soon to FDO Toolbox! Now if there was only a way of plugging FDO Toolbox into MapGuide Maestro to transform data and either create packages or load data directly into MapGuide. ;)

-J

, ,

1 Comment

JSON and GeoJSON in FME

Many of you know about JSON, an object serialization scheme that has rapidly gained acceptance in AJAX-style applications. What you may not know is that there is an effort to standardise the representation of JSON-ified spatial features, known as GeoJSON.

FME is usually quick to support new formats (like KML and GeoRSS) but this time Safe has surpassed themselves, getting early JSON and GeoJSON support into their betas before the GeoJSON specification has reached a release version. Tonight I took some time out to play with this new support.

The basis of JSON support in FME is provided by two new read/write formats: JSON and GeoJSON. These new formats are augmented by two new transformers: JSONExploder and JSONExtractor. To get started, I’m going to show you how to extract data from a JSON source I happen to have lying around (it’s publicly available at Yahoo Pipes), which looks like this:

JSON Raw

When you first import this data source into FME, it is imported with the top level of attributes broken out. In this case, Pipes returns a top-level object with several attributes (link, description, etc) that you can see in my test workspace below:

JSON Workspace

Now, this isn’t much use, because my features are hidden inside the “items” attribute. In order to get them out, I first need to explode my single object into multiple features. The new JSONExploder transformer comes to the rescue here:

JSONExploder

Now, I have a unique feature for each of my feed items, but I really want some of the nested attributes. In particular, I want the description from the root of the item, and the nested y:location["lat"] and y:location["lon"] attributes. The JSONExtractor makes it easy to pull these out into new attributes:

JSONExtractor

And once adding a couple more of these, each of my features has some nice attributes attached to it, which I could then turn into points if I wanted:

JSON Attributes

OK, so that’s kinda cool from a straight ETL standpoint. I can take in JSON, mess with it, and then pump it out into whatever format I want. But the fun stuff is when you start getting into GeoJSON. Fortunately there are a couple early adopters, Christopher Schmidt and Howard Butler, who gave me some feeds to play around with. The first of these comes from Christopher’s super-flexible FeatureServer application (check it out, it’s open source):

FeatureServer Demo in OpenLayers

The features displayed on this OpenLayers map can be easily downloaded from FeatureServer in GeoJSON format (or KML, or GeoRSS, or whatever). The URL for the GeoJSON representation is:

http://featureserver.org/featureserver.cgi/scribble?format=geojson

Pulling this into FME is as simple as creating a new FME data source, and specifying the URL:

Add GeoJSON

As you can see, you can then treat this data like any other spatial data source:

GeoJSON Visualizer

Now, for a final example… Howard has a GeoJSON resource collection of counties in Iowa, accessible in a pattern something like this:

http://geoservices.hobu.biz/political/json/johnson

Now we could take this feature, in its source projection of UTM Zone 15N NAD83, but Howard’s put together a really nifty (non-commercial use only, unless you want to pay Howard some $$$) JSON-based web processing projection service. Not only that, but he’s also made it smart enough to interpret projections referenced locally, but also from the oh-so-cool (and built as a collaborative effort between Christopher and Howard) SpatialReference.org. So, all you need to do is feed it the URL of your source data, the url of your source CRS (http://spatialreference.org/ref/epsg/26915/) and the url of your destination CRS (http://spatialreference.org/ref/epsg/4326/). Like so:

http://geoservices.hobu.biz/project/?url=%22http://geoservices.hobu.biz/political/json/johnson%22&inref=http://spatialreference.org/ref/epsg/26915/proj4/&outref=http://spatialreference.org/ref/epsg/4326/proj4/

And, as just another link in the dynamic web chain, FME can read this transformed JSON feature:

HoBu GeoJSON Example

Now, for desktop FME users, this gives us “Pipes on Steroids”: all the mashup flexibility of Yahoo Pipes, with the huge format support and rich processing model of FME. As cool as this is, I think the real power will be seen whenever Safe integrates this functionality into their Server product. It will allow them to play well on both the “enterprise” traditional GML/WFS/etc level and on the neogeography JSON/GeoRSS/KML mashup level with a single product from a single (or multiple if you want) data source. For organisations that want turn-key interoperability solutions, FME Server is going to rock your world.

-J

P.S. I’m thinking about getting a personalized plate that says GEO JSN :)

No Comments

SQL Server 2005 – Safe at Last!

I admit it: I use SQL Server 2000 at work. It acts as my job scheduler, and it has handled my daily spatial and tabular data transformations for over three years. Most of the work is done by Data Transformation Services (DTS) packages and a vbs script that calls FME.

Those of you who have used Safe Software’s FME Workbench know that it is a great tool for performing spatial extract/transform/load (ETL) operations. It’s a unique way of setting up extremely powerful location data transformations without having to write a single line of code (but you can… in TCL and, more recently, Python). Workbench is an environment similar to DTS and ESRI’s Model Builder; FME’s implementation just happens to be more user-friendly and powerful:

FME Workbench Screenshot

Two recent developments in this field have made me into one extremely happy camper.

The first of these developments is the new version of DTS packaged with SQL Server 2005, which contains enough upgrades that it warrants a fancy new name: SQL Server Integration Services (SSIS). With this release, Microsoft has introduced a two-plane design paradigm, with the outer plane controlling process flow, and the inner plane controlling data flow. I can’t cover much here; suffice to say that it’s extremely powerful, easy to use, and that this feature alone makes the upgrade from 2000 worthwhile. The Microsoft SQL Server Developer Center is a good place to start learning more about SSIS if you’re interested.

The second of the recent developments ties into the first. Safe Software has done a soft launch on a beta FME SSIS plugin that encapsulates all of the cool Spatial ETL capabilities of FME inside SQL Server! Now I can set up a central SSIS workflow that manages all of my spatial transformations. I can also (and this is a big plus) make use of SQL Server’s native data transformation capabilities. This means that I can join disparate tabular data mid-transform to various spatial data sources before outputting a final product. I can also set up rollback routines without resorting to complicated scripting.

The Safe engineers have done a great job on this integration. Apart from some very FME-ish property and data source dialogs (which is a good thing™) it works seamlessly within the context of SSIS. I am happy to see that they managed to get the visualiser to work from within SSIS as well.

Here are some example shots from their web page:

Opening an ArcSDE data source
Opening ArcSDE Data Source

Setting the properties of a LengthCalculator transformer
Setting LengthCalculator properties

Running the SSIS package (ohhh pretty colours… would like to see this in Workbench)
Running SSIS

Viewing spatial results in FME Viewer
Visualiser Output

Viewing CSV results in text editor
CSV Output

Incidentally, this is not the only integration project that Safe has done. If you use Informatica they have also set up a beta program for FME integration with PowerCenter. If you’re an ArcGIS user, the Data Interoperability Extension gives you the same kind of integration with Model Builder (along with native data access to dozens of formats not supported directly by ESRI) and the 9.2 edition will come with a refresh to the FME 2006 GB code base.

I am amazed at Safe’s dedication to the leading edge. First there was Google Earth KML support, next came support for the SDF+ data format (an open format used for MapGuide Open Source and by Autodesk in its geospatial applications) and now this. What’s next? GeoRSS support?

-J

P.S. If you haven’t signed up for the First Annual FME User Conference, do it now. Vancouver is a great city, and I’m sure we’re going to have lots of fun learn lots of useful stuff there.

1 Comment