
Re: [Orekit Users] Test failure

Does this help?

--- maven-jgit-buildnumber-plugin:1.2.10:extract-buildnumber (default) @ orekit ---
Git info extracted, revision: '31900f4', branch: 'master', tag: '', commitsCount: '4008', commitDate: '2018-05-27', buildnumber: 'master.4008.31900f4'

--- maven-jar-plugin:3.0.2:jar (default-jar) @ orekit ---
Building jar: /home/walter/repositories/orekit-main/target/orekit-9.2.jar

Running org.orekit.estimation.sequential.UnivariateprocessNoiseTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 2, Time elapsed: 0.491 s - in org.orekit.estimation.sequential.UnivariateprocessNoiseTest

On Mon, Jun 4, 2018 at 9:54 AM, MAISONOBE Luc <luc.maisonobe@c-s.fr> wrote:
Hi Walter,

Walter Grossman <w.grossman@ieee.org> wrote:

I am a newbie to Orekit.  I ran the tests and got a "near-miss" failure.  I
resolved it by relaxing the precision.  How do I know if I am OK?

OrbitDeterminationTest.testW3B:384 expected:<0.687998> but

found this line:  Assert.assertEquals(0.687998, covariances.getEntry(6, 6),

Is the problem that the acceptance criterion is too tight?  Why?

The test tolerance is intentionally extremely small; see below
for the rationale for this stringent choice. The test should, however,
succeed with the current settings. Could you tell us which version
of Orekit you use (development version from the git repository, or a released
version?) and with which Java environment (OS, JVM version, processor)?

Some tests in Orekit are built in several stages. First, the test is
created without any thresholds and only outputs its results, which the
developer compares with whatever is available to gain confidence in
them. This may be runs of other reference programs if available, another
independent implementation using different algorithms, or a sensitivity
analysis with the program under test itself. This validation phase may be
quite long.

Once developers are convinced the implementation is good, they run the test
one last time and register its output as the reference values with a
stringent threshold, which turns the test into a non-regression test. The
threshold is therefore not an indication that the results are very good; it
is only a way for us to ensure that any change in the code that affects this
part will break the test and force developers to look at this code again and
decide what to do. They can decide that the changes that broke the test are
valid and only changed the results in an acceptable way (sometimes improving
them), so they change either the reference value or the threshold. Or they
can decide that the changes triggered something unexpected and that they
should improve their new code so the test passes again without being changed.

In summary, thresholds for non-regression tests are small so that they act as
a fuse: people notice when it blows and can make a decision.
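
To make the "fuse" idea concrete, here is a minimal sketch. It is not Orekit
code: the class, the Newton iteration for sqrt(2) standing in for the
computation under test, the reference value and both tolerances are all
assumptions chosen for illustration. The helper applies the same acceptance
rule as JUnit's Assert.assertEquals(expected, actual, delta), written out by
hand so the sketch runs without JUnit.

```java
// Illustrative sketch only; nothing here is taken from the Orekit test suite.
class NonRegressionSketch {

    // Hypothetical reference value: the output registered after the validation
    // phase, when the code under test used 4 Newton iterations.
    static final double REFERENCE = 1.4142135623746899;

    // Hypothetical stringent threshold: far tighter than the true accuracy
    // of the result, so that any change in the code shows up.
    static final double STRINGENT = 1.0e-14;

    // Hypothetical stand-in for the computation under test:
    // Newton iteration for sqrt(2), starting from 1.0.
    static double newtonSqrt2(int iterations) {
        double x = 1.0;
        for (int i = 0; i < iterations; i++) {
            x = 0.5 * (x + 2.0 / x);
        }
        return x;
    }

    // Same acceptance rule as JUnit's assertEquals(expected, actual, delta).
    static boolean withinTolerance(double expected, double actual, double delta) {
        return Math.abs(expected - actual) <= delta;
    }

    public static void main(String[] args) {
        // Unchanged code: the non-regression check passes.
        System.out.println("4 iterations, stringent: "
                + withinTolerance(REFERENCE, newtonSqrt2(4), STRINGENT));

        // Changed code (one more iteration, a more accurate result):
        // the result moves by about 1.6e-12, so the fuse blows.
        System.out.println("5 iterations, stringent: "
                + withinTolerance(REFERENCE, newtonSqrt2(5), STRINGENT));

        // A relaxed threshold hides the change entirely.
        System.out.println("5 iterations, relaxed:   "
                + withinTolerance(REFERENCE, newtonSqrt2(5), 1.0e-9));
    }
}
```

In this sketch the change that breaks the check is actually an improvement, so
the developer would register a new reference value. Relaxing the threshold
would also make the check pass, but then nobody would ever have looked at the
change.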

best regards,