[Orekit Users] design of the DataProviders

Hello,

I think I have encountered a use case that, to my knowledge, Orekit cannot
handle. This led me to investigate the inner workings of the DataProviders. I
believe that the design, while very robust for "one-shot" simulations, is not
well suited to applications where the model data change during execution.
Could someone please confirm whether I have understood this correctly?

My starting point is that I need to execute Orekit code in a
multi-threaded context, with thread-specific data providers. My actual
use case is about solar activity, but for the sake of simplicity I
will provide examples based on the 'tai-utc.dat' file.

Basically, I have two versions of the data file, "a.dat" and "b.dat",
and I need to run thread A using a.dat and thread B using b.dat
simultaneously. Since DataProvidersManager is a singleton, I have not
found a way to do that.
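
To make the requirement concrete, here is a minimal sketch in plain Java of
what "thread-specific" would mean. It is only an illustration under stated
assumptions: it does not use Orekit at all, and the class and method names
(ThreadScopedDataSketch, useDataFile, currentDataFile, runBoth) are
hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch in plain Java: none of these names exist in Orekit.
final class ThreadScopedDataSketch {

    // Each thread holds its own data file name, so nothing is shared between threads.
    private static final ThreadLocal<String> DATA_FILE = new ThreadLocal<>();

    static void useDataFile(String name) {
        DATA_FILE.set(name);
    }

    static String currentDataFile() {
        String name = DATA_FILE.get();
        if (name == null) {
            throw new IllegalStateException("no data file configured for this thread");
        }
        return name;
    }

    // Runs threads A and B concurrently and records the file each one resolved.
    static Map<String, String> runBoth() {
        Map<String, String> seen = new ConcurrentHashMap<>();
        Thread a = new Thread(() -> { useDataFile("a.dat"); seen.put("A", currentDataFile()); });
        Thread b = new Thread(() -> { useDataFile("b.dat"); seen.put("B", currentDataFile()); });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker threads", e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(runBoth());
    }
}
```

With a process-wide singleton, the second call to a method like useDataFile
would affect both threads. Here each thread only ever sees its own setting.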

I understand that my need is very specific. But in the process of simplifying my
test case as much as possible, I have found another strange behaviour of
DataProvidersManager, even in a single-threaded application. I find this more
problematic. Since model data are usually cached by the factory classes (for
instance TimeScalesFactory), it seems virtually impossible to change the
data providers during execution, even in a single-threaded context.

I have an example to illustrate this. I have built two data sets:

- dataset 1 uses a "correct" tai-utc.dat
- dataset 2 uses a modified tai-utc.dat, where I have added a 0.5 s shift in TAI-UTC values

Now I perform a simple computation on each of them with the following method:

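    private void displayUTC(String datapath) throws OrekitException {
        // Register the directory holding the data set under test
        DataProvidersManager.getInstance().addProvider(new DirectoryCrawler(new File(datapath)));

        // UTC is built from the TAI-UTC data found by the registered providers
        TimeScale utc = TimeScalesFactory.getUTC();
        AbsoluteDate date = new AbsoluteDate("1999-08-22T00:00:00", utc);

        // Offset from the Galileo epoch, then the names of the data files actually loaded
        System.out.println(date.durationFrom(AbsoluteDate.GALILEO_EPOCH));
        System.out.println(DataProvidersManager.getInstance().getLoadedDataNames());
    }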

My test code looks like this. Each method must be run in a separate
process, to make sure nothing is kept in memory between executions.

RUN1:

    public void test1() throws OrekitException {
        System.out.println("Test 1");
        displayUTC("C:\\dataset1");
    }

OUTPUT:

    Test 1
    0.0
    [C:\dataset1\tai-utc.dat]

This is the expected output.

RUN2:

    public void test2() throws OrekitException {
        System.out.println("Test 2");
        displayUTC("C:\\dataset2");
    }

OUTPUT:

    Test 2
    0.5
    [C:\dataset2\tai-utc.dat]

This is the expected output. Notice the 0.5 s shift that I have
introduced in the data file.

RUN3:

    public void testBoth() throws OrekitException {
        System.out.println("Test both");
        displayUTC("C:\\dataset1");
        DataProvidersManager.getInstance().clearProviders();
        displayUTC("C:\\dataset2");
    }

OUTPUT:

    Test both
    0.0
    [C:\dataset1\tai-utc.dat]
    0.0
    [C:\dataset1\tai-utc.dat]

As you can see, the results for the second test change when it is executed right
after the first, in the same process. Dataset 1 is used twice, despite
clearing the data providers and registering dataset 2. I believe this is because
the data is cached in the TimeScalesFactory. I think it would make more sense to
cache the data in the DataProviders (or maybe DataLoaders) instead of the
factories.
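
The difference between the two cache locations can be sketched with stand-in
classes. This is an illustration of my hypothesis only: it is plain Java, it
is not how TimeScalesFactory or DataProvidersManager are actually written, and
every name in it is made up. The values 0.0 and 0.5 stand for the two results
printed in the runs above.

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Stand-in classes only: not the actual Orekit implementation.
final class CacheOwnershipSketch {

    // Plays the role of the currently registered provider.
    private static Supplier<Double> provider;

    // Variant 1: the factory owns the cache and never consults the provider again.
    private static Double factoryCache;

    // Variant 2: the cache is keyed by provider instance, so a new provider triggers a fresh load.
    private static final Map<Supplier<Double>, Double> providerCache = new IdentityHashMap<>();

    static void setProvider(Supplier<Double> newProvider) {
        provider = newProvider;
    }

    static double fromFactoryCache() {
        if (factoryCache == null) {
            factoryCache = provider.get();
        }
        return factoryCache;
    }

    static double fromProviderCache() {
        return providerCache.computeIfAbsent(provider, Supplier::get);
    }

    // Mirrors RUN3: read with dataset 1, switch to dataset 2, read again.
    static double[] switchDatasets() {
        setProvider(() -> 0.0);   // dataset 1
        fromFactoryCache();
        fromProviderCache();
        setProvider(() -> 0.5);   // dataset 2, with the 0.5 s shift
        return new double[] { fromFactoryCache(), fromProviderCache() };
    }
}
```

After the switch, the factory-owned cache still returns the value loaded from
dataset 1, while the provider-keyed cache returns the value from dataset 2.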

Finally, a very minor nitpick: the data loading mechanism is
based on data file names. This is a bit confusing when working with
non-file-based storage, typically a database of some sort. Asking for data by
"type" (solar activity, Earth orientation parameters...) would seem more
intuitive to me.
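
As a rough sketch of what a request "by type" could look like, assuming
nothing about Orekit itself: the enum, the interface and the in-memory
implementation below are all hypothetical names, and the String payload is a
placeholder for whatever the real data would be.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical names throughout; Orekit does not define these types.
final class DataByTypeSketch {

    enum DataType { LEAP_SECONDS, SOLAR_ACTIVITY, EARTH_ORIENTATION_PARAMETERS }

    // The caller states what it needs; the source decides where it comes from.
    interface ModelDataSource {
        Optional<String> find(DataType type);
    }

    // In-memory stand-in for a database-backed source.
    static final class InMemorySource implements ModelDataSource {
        private final Map<DataType, String> records = new EnumMap<>(DataType.class);

        InMemorySource with(DataType type, String content) {
            records.put(type, content);
            return this;
        }

        @Override
        public Optional<String> find(DataType type) {
            return Optional.ofNullable(records.get(type));
        }
    }
}
```

A file-based source and a database-based source could then implement the same
interface, and no file name would appear in the calling code.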

After re-reading this email, I feel like I am bashing the data loading
mechanism. Please do not interpret my feedback this way: I have used Orekit for
several years now, and this is the first time I feel like I have hit a hard
limitation. This is a testament to the overall design of the library.

I have started thinking about possible refactorings of the model data
management. I have a somewhat similar behaviour somewhere else in my software,
and I have used a dependency inversion based on Java services to solve it. So
far, it seems to work quite well (but my software is not that big yet, so it
might be a bit early to tell). Maybe something like this could be implemented
for Orekit data management? I would gladly share a very basic draft of my ideas
if it can be of any help.
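
To give an idea of the mechanism, here is a minimal sketch assuming that
"Java services" means the standard java.util.ServiceLoader lookup. The
interface ModelDataService and the fallback are hypothetical and only serve
the illustration. This is not the draft mentioned above and not a proposal
for an actual Orekit API.

```java
import java.util.ServiceLoader;

// Hypothetical sketch: the consumer depends on an interface, not on a concrete loader.
final class ServiceLookupSketch {

    // Service interface that data-management implementations would provide.
    interface ModelDataService {
        String name();
    }

    // Returns the first implementation registered on the classpath, or a
    // built-in default when no implementation has been registered.
    static ModelDataService locate() {
        for (ModelDataService service : ServiceLoader.load(ModelDataService.class)) {
            return service;
        }
        return () -> "built-in default";
    }

    public static void main(String[] args) {
        System.out.println(locate().name());
    }
}
```

An implementation is registered by listing its class name in a
META-INF/services file named after the interface, so the application can
swap the data source without changing the calling code.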

Thank you for your time.

Yannick Jeandroz
TESOA2 - Flight Dynamics
T +33 (0)5 62 19 51 71
E yannick.jeandroz@airbus.com