This repository has been archived by the owner on Sep 20, 2024. It is now read-only.

MasterCatalog

martin schouwenburg edited this page Feb 17, 2015 · 4 revisions

The MasterCatalog is a repository where all data sources that ilwis has ‘seen’ are registered.

### Design Rationale

The user of the framework needs one consistent view of the data and operations he uses. As the goal is to support data from many kinds of sources, the way these are addressed can quickly become complicated. Locating a local file is different from addressing a service, which is again different from, say, a database connection. The master catalog is designed to abstract from all these addressing methods and to offer one view on all the data it is connected to.

### Description

A data source (in our case) is anything that can be translated into an IlwisObject. Ilwis scans folders and queries remote sources through its catalog connectors to discover which data sources can be found at a location. The sources it recognizes are stored as ‘Resource’ in the MasterCatalog. Apart from being a view on all 'seen' data, this class has a secondary function: registering all instantiations of resources in the mastercatalog. When a resource is instantiated to a true IlwisObject, it registers itself in the master-catalog. This ensures that all objects can be retrieved through their id or resource (or url/type) and that there can never be a duplicate IlwisObject instantiation.

#### Master Catalog as 'File System'

For its first function, the MasterCatalog is stored as a database and thus can be queried as such. The data of the resources is stored in two tables: MasterCatalog and CatalogItemProperties.

Table MasterCatalog:

| Column Name | Type | Description |
| --- | --- | --- |
| ItemId | ULong Long | A unique integer identifying the resource. |
| Name | String | A non-unique string naming the source. |
| Code | String | A unique code identifying the object. Usually the code is defined by an external authority (e.g. epsg codes). Optional. |
| Container | URL | A source is ‘contained’ in something. This may be a folder, a service location or any other means that can indicate containment. The containment is used to identify relations between data-sources. |
| Resource | Url | Path to the data-source. Together with the type this is a unique way of addressing the data-source. This is a 'normalized' url. |
| Type | ULong Long | An ilwis-defined type identifier. This is the main type of the data source: the role in which this type of data source is usually seen. See also ‘Extended type’. |
| ExtendedType | ULong Long | The additional roles this type of data source can play. For example, a geotiff file also contains all the necessary information for its spatial reference system. So, data-wise, it is also a spatial reference system, though of course its primary role is that of raster coverage. |
| Dimensions | String | Dimensions mean different things for different data sources, but they are (in ilwis terms) standardized per type. So one can use this to query. |

Table CatalogItemProperties:

| Column Name | Type | Description |
| --- | --- | --- |
| ItemId | ULong Long | A unique integer identifying the resource. |
| PropertyName | String | Name identifying the property. |
| PropertyValue | String | String representation of the value. |

This function of the master-catalog is quite complex, as it has many convenience methods that convert urls/names/resources/codes to ids and vice versa. After all, this is its main task (see also master-catalog). This function of the mastercatalog will never contain anonymous objects (see...).
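To make the conversions concrete, here is a minimal, self-contained sketch of an in-memory stand-in for the MasterCatalog table. It is not the ilwis implementation: `ResourceRow`, `CatalogTable`, `url2id`, `id2resource`, `withRole` and the two type constants are hypothetical names, and treating Type and ExtendedType as bit flags is an assumption made for illustration. It only shows how a (url, type) pair can map to an id and back, and how an extended type lets one source be found under a second role.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type flags; the real ilwis identifiers and values may differ.
constexpr std::uint64_t kRaster   = 1ULL << 0;
constexpr std::uint64_t kCoordSys = 1ULL << 1;

// One row of the MasterCatalog table, reduced to the columns used here.
struct ResourceRow {
    std::uint64_t itemId;
    std::string name;
    std::string container;
    std::string resource;        // normalized url
    std::uint64_t type;          // main role
    std::uint64_t extendedType;  // additional roles (assumed to be bit flags)
};

class CatalogTable {
public:
    // Adds a row; refuses a duplicate id or a duplicate (url, type) pair.
    bool add(const ResourceRow& row) {
        auto key = std::make_pair(row.resource, row.type);
        if (rows_.count(row.itemId) || byUrlType_.count(key)) return false;
        byUrlType_[key] = row.itemId;
        rows_[row.itemId] = row;
        return true;
    }

    // (url, type) -> id
    std::optional<std::uint64_t> url2id(const std::string& url,
                                        std::uint64_t type) const {
        auto it = byUrlType_.find(std::make_pair(url, type));
        if (it == byUrlType_.end()) return std::nullopt;
        return it->second;
    }

    // id -> row
    std::optional<ResourceRow> id2resource(std::uint64_t id) const {
        auto it = rows_.find(id);
        if (it == rows_.end()) return std::nullopt;
        return it->second;
    }

    // Ids of all rows that can play the given role, as main or extended type.
    std::vector<std::uint64_t> withRole(std::uint64_t role) const {
        std::vector<std::uint64_t> ids;
        for (const auto& [id, row] : rows_)
            if ((row.type | row.extendedType) & role) ids.push_back(id);
        return ids;
    }

private:
    std::map<std::uint64_t, ResourceRow> rows_;
    std::map<std::pair<std::string, std::uint64_t>, std::uint64_t> byUrlType_;
};
```

With a geotiff registered as `kRaster` with extended type `kCoordSys`, `withRole(kCoordSys)` would return its id even though its main type is raster, while `url2id` with `kCoordSys` would not, because the addressing key uses the main type only.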

##### Usage

The mastercatalog is a singleton in every instance of the Kernel. It is available through the mastercatalog() method, which is globally available (if you include mastercatalog.h). This gives you access to the mastercatalog instance. For a description of the interface go to master-catalog. So for example

    context()->setWorkingCatalog(ICatalog("file:///d:/data/gisdata"));
    Resource myraster = mastercatalog()->name2resource("somemap.tif", itRASTER);

will retrieve the resource for "somemap.tif" from the mastercatalog. The first line is a shortcut that makes "file:///d:/data/gisdata" the default lookup catalog, so no path is needed in the second line. This can be convenient if you work from one location. Note that the working catalog can be anything that can be defined as a catalog.
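The effect of the working catalog on name lookup can be pictured with a small sketch. This is an assumption about the behaviour, not the ilwis code: `hasScheme` and `resolveName` are hypothetical helpers, and the real lookup may do more than join strings (for example for service or database catalogs).

```cpp
#include <string>

// True when the name already carries a scheme such as "file://" or "http://".
inline bool hasScheme(const std::string& name) {
    return name.find("://") != std::string::npos;
}

// Resolves a bare name against the working catalog url.
// A name that is already a full url is returned unchanged.
inline std::string resolveName(const std::string& workingCatalog,
                               const std::string& name) {
    if (hasScheme(name) || workingCatalog.empty()) return name;
    if (workingCatalog.back() == '/') return workingCatalog + name;
    return workingCatalog + "/" + name;
}
```

Under these assumptions, `resolveName("file:///d:/data/gisdata", "somemap.tif")` yields the full url that is then combined with the type to address the resource.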

    mastercatalog()->addContainer("http://some.server/wfs?service=WFS&request=GetCapabilities&acceptversions=1.1.0");

will add a new catalog (the url) to the master-catalog together with its contents, as the catalog is automatically scanned when it is added.

#### MasterCatalog as Object Registry

The second, maybe even more important, function is that of object registry. Every ilwis-object is a unique combination of url and object type. It can exist only once in every instance of the kernel. This means that when one tries to open such a combination, the system must be able to retrieve a previous (still existing) instance of that combination. The master-catalog is the central repository for instances of objects.

##### Usage

The master-catalog always holds one instance in a shared pointer and returns this (thus upping the reference count of the shared pointer) when a request is made for a "copy" of the object. For example

    ICoordinateSystem csy = mastercatalog()->get(4897);
    IGeoReference grf = mastercatalog()->get("file:///d:/data/n00032.img", itGEOREF);

If the last instance ("copy") outside the mastercatalog goes out of scope, the master-catalog also cleans up its instance and the memory is finally freed. It still exists as a resource, though, in the first function of the master-catalog. Anonymous (temporary) objects are in the list of instances but not in the list of resources.
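The lifetime rule can be illustrated with a small sketch. Note the substitution: the text says the master-catalog holds a shared pointer and cleans it up itself, whereas this sketch stores a `std::weak_ptr`, which gives the same observable behaviour (one instance per url/type, freed when the last outside copy goes away) with less bookkeeping. `Object`, `ObjectRegistry`, `get` and `liveCount` are hypothetical names, not the ilwis API.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Stand-in for an IlwisObject; identified by url and type.
struct Object {
    std::string url;
    std::uint64_t type;
};

class ObjectRegistry {
public:
    using Key = std::pair<std::string, std::uint64_t>;

    // Returns the existing instance for (url, type), or creates one.
    std::shared_ptr<Object> get(const std::string& url, std::uint64_t type) {
        Key key{url, type};
        auto it = live_.find(key);
        if (it != live_.end()) {
            if (auto existing = it->second.lock()) return existing;
        }
        auto created = std::make_shared<Object>(Object{url, type});
        live_[key] = created;  // the registry only observes, it does not own
        return created;
    }

    // Number of registered objects that are still alive.
    std::size_t liveCount() const {
        std::size_t n = 0;
        for (const auto& entry : live_)
            if (!entry.second.expired()) ++n;
        return n;
    }

private:
    std::map<Key, std::weak_ptr<Object>> live_;
};
```

Two calls to `get` with the same url and type return pointers to the same object. Once every pointer handed out has gone out of scope the object is destroyed, and a later `get` creates a fresh instance.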

It is (very) important to note that one should not do:

    GeoReference *grf = new GeoReference(someresource);

Doing so bypasses the master-catalog and thus messes up the registration. This is only allowed in the connectors, where the actual object is created at designated interface functions (IlwisObject *create() const). This function is called by the framework and thus operates in an expected way. The "I" objects (e.g. IRasterCoverage, IGeoReference, ...) are safe to use as they correctly update the master-catalog:

    IGeoReference grf(someResource);

is fine.

### Contents of the mastercatalog

The MasterCatalog is a key concept in ILWIS NG, so its contents must always be correct. But the MasterCatalog, just like IlwisObjects, doesn’t know anything about data-sources. To fill its content it relies on a mechanism that is very similar to the IlwisObject programmatic flow. The location connector is selected based on the nature of the location. For example, file urls can be read by a location connector for gdal or Ilwis3 format files, while a location based on an OpenDap url can be read by the OpenDap connector. Once the connector is chosen, it starts to read the content of its location, translates every data-source it finds into a Resource and adds these to the MasterCatalog. Catalogs are views on the MasterCatalog based on a query.

In more detail

  1. Context: we have a url that points to a collection of data (sources). This may be a simple url to a local folder, the GetCapabilities of a WMS, etc.
  2. There are two cases: either a catalog is already known for this location or it is a new location. In the first case (not in the diagram), the MasterCatalog creates a catalog object and fills it with the data found earlier. This catalog is returned to the client.
  3. In the second case, the MasterCatalog passes this url to the MasterFactory. The MasterCatalog doesn’t know what this url means, but the MasterFactory has access to factories that might know what to do with it.

  4. The MasterFactory queries the known (catalog) factories for a connector that can handle this kind of url. If there is one, the connector is created and added to the catalog. There is a difference here between IlwisObjects and Catalogs: a Catalog may have more than one connector. This is relevant when a location is served by more than one connector. Each connector is valid for the location and might produce additional results. For example, a local directory might be served by a GDAL (catalog) connector and a NetCDF (catalog) connector. The gdal connector sees only files that can be handled by gdal; the netcdf connector sees only netcdf files. The combination of these two (assuming that no further connectors are relevant) fills the MasterCatalog (and subsequently the catalog) with, in this case, file-based resources. The MasterCatalog now has a format-independent view on the underlying data.
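The connector selection in steps 3 and 4 can be sketched as follows. This is a simplified illustration, not the ilwis factory code: `CatalogConnector`, `FoundResource`, `scanLocation` and `exampleConnectors` are hypothetical names, and the file extensions each connector accepts are made up for the example. Every connector that can handle the location contributes the entries it recognizes, and the union is what ends up in the catalog.

```cpp
#include <functional>
#include <string>
#include <vector>

// A resource found at a location, plus the connector that recognized it.
struct FoundResource {
    std::string url;
    std::string connector;
};

// Hypothetical catalog connector: one test for the location, one per entry.
struct CatalogConnector {
    std::string name;
    std::function<bool(const std::string&)> canHandleLocation;
    std::function<bool(const std::string&)> recognizes;
};

inline bool startsWith(const std::string& s, const std::string& prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

inline bool endsWith(const std::string& s, const std::string& suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Asks every connector that accepts the location which entries it recognizes.
inline std::vector<FoundResource> scanLocation(
        const std::string& location,
        const std::vector<std::string>& entries,
        const std::vector<CatalogConnector>& connectors) {
    std::vector<FoundResource> found;
    for (const auto& c : connectors) {
        if (!c.canHandleLocation(location)) continue;
        for (const auto& e : entries)
            if (c.recognizes(e)) found.push_back({location + "/" + e, c.name});
    }
    return found;
}

// Illustrative connectors; the extension lists are assumptions.
inline std::vector<CatalogConnector> exampleConnectors() {
    auto isFile = [](const std::string& loc) { return startsWith(loc, "file://"); };
    return {
        {"gdal", isFile,
         [](const std::string& f) { return endsWith(f, ".tif") || endsWith(f, ".img"); }},
        {"netcdf", isFile,
         [](const std::string& f) { return endsWith(f, ".nc"); }},
        {"wfs",
         [](const std::string& loc) { return startsWith(loc, "http://"); },
         [](const std::string&) { return true; }},
    };
}
```

For a local folder containing a `.tif`, a `.nc` and a `.txt` file, the gdal and netcdf connectors each contribute one resource, the wfs connector is skipped because it does not accept file urls, and the text file is seen by nobody.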
    

Note that the filling of the MasterCatalog follows the pointer to the WorkingLocation(s). The WorkingLocation is a location that currently has the focus of work (e.g. a working folder). The MasterCatalog will index only those locations, so it sees only what is necessary.
