Ideas on `OrbitArray` - poliastro/poliastro GitHub Wiki

Context: https://github.com/poliastro/documents/blob/master/numfocus-sdg-2021-r3.md

Contents of 2022-01-14 community meeting:

class BaseState:
    attractor: Body
    plane: Planes


class ClassicalStateScalar(BaseState):
    @property
    def a(self):
        return self._a


class RVStateScalar(BaseState):
    def to_classical(self):
        return ClassicalState(...)


class ClassicalStateArray(ClassicalStateScalar):
    pass


class RVStateArray(RVStateScalar):
    def to_classical(self):
        return ClassicalStateArray(...)


class _BaseOrbit:
    @property
    def a(self):
        return self._state.to_classical().a


class OrbitArray(_BaseOrbit):
    pass


class Orbit(_BaseOrbit):
    pass


def print_a(something_with_a):
    print("Semimajor axis: ", something_with_a.a)


def print_mean_a(something_with_a):
    # if not insinstance(something_with_a, OrbitArray)
    # if not hasattr(some.., "len"):
    print("Mean semimajor axis: ", something_with_a.a.mean())

Email sent to Sebastian on 2022-01-15:

`OrbitArray` use cases

Propagate a number of orbits

See for example this code:

for cache_idx, orb_idx in enumerate(range(orb_idx_start, orb_idx_stop)):

    orb = self._orbit_from_mpc(self._data[orb_idx])
    astro_timedeltas = self._astro_times - orb.epoch
    self._cache[cache_idx, :, :] = np.transpose(propagate(
        orb, astro_timedeltas
        ).xyz.to(u.AU).value)

which could be slightly rewritten as:

for cache_idx, asteroid_data in enumerate(self._data[orb_idx_start:orb_idx_stop]):

    orb = self._orbit_from_mpc(asteroid_data)
    astro_timedeltas = self._astro_times - orb.epoch
    coords = propagate(orb, astro_timedeltas)
    self._cache[cache_idx, :, :] = coords.get_xyz(xyz_axis=1).to_value(u.au)

You already noticed that there is no Orbit method for what we want here, which is why you needed to use poliastro.twobody.propagation.propagate, as explained in gh-1364.

Let's imagine that OrbitArray exists:

orbs = []
for asteroid_data in self._data[orb_idx_start:orb_idx_stop]:
    orbs.append(self._orbit_from_mpc(asteroid_data))

# We create the OrbitArray from a sequence of Orbit objects
# (that way we don't need to duplicate every Orbit classmethod)
orbarr = OrbitArray(orbs)

# Since each orbit can have a different epoch,
# we need a 2D array of times,
# which requires some NumPy-fu
# (In your example you used np.meshgrid instead, which is equivalent)
# Notice the plural in .epochs!
astro_timedeltas = (
    np.repeat(self._astro_times[None, :], len(orbarr), axis=0)
    - orbarr.epochs[:, None]
)
assert astro_timedeltas.shape == (len(orbarr), len(self._astro_times))

# And now each orbit in the array gets propagated
self._cache[cache_idx:cache_idx + len(orbarr), :, :] = propagate_array(orbarr, astro_timedeltas)

For this to work, so far we only need

from attrs import define


@define
class OrbitArray:
    orbits: list[Orbit]

    def __len__(self):
        return len(self.orbits)

    def __iter__(self):
        return iter(self.orbits)

    def epochs(self):
        return Time([orb.epoch for orb in self])


def propagate_array(
    orbit_array: OrbitArray | list[Orbit],  # Notice that a simple list of Orbit suffices for now
    timedeltas: TimeDelta
) -> CartesianRepresentation:
    # Easy implementation that does not do any parallelization or optimization
    # but retains the API
    coords_list = []
    for index, orbit in enumerate(orbit_array):
        coords_list.append(propagate(orbit, timedeltas[index]))

    # FIXME: The from_list method doesn't actually exist,
    # pseudocode for "assembly CartesianRepresentation object from the array above"
    return CartesianRepresentation.from_list(coords_list)

If we go one step further and attempt to solve gh-1364, we could add the method you propose, propagate_many. We can make it take the times instead of the timedeltas and avoid the NumPy broadcasting magic in user space:

coords = orbarray.propagate_many(self._astro_times)
self._cache[cache_idx:cache_idx + len(orbarr), :, :] = coords.get_xyz(xyz_axis=1).to_value(u.au)

At the moment, the proposed solution for [gh-1364]
involves adding a new method to `Ephem` instead,
but to simplify the discussion we can go ahead with `propagate_many`.

That looks much better! And the implementation would be something like this:

@define
class OrbitArray:
    orbits: list[Orbit]

    def __len__(self): ...
    def __iter__(self): ...
    def epochs(self): ...

    def propagate_many(self, epochs_array: Time) -> CartesianRepresentation:
        # Easy implementation that does not do any parallelization or optimization
        assert epochs_array.ndim == 2

        coords_list = []
        for index, orbit in enumerate(self):
            coords_list.append(propagate(orbit, epochs_array[index]))

        # FIXME: The from_list method doesn't actually exist
        return CartesianRepresentation.from_list(coords_list)

This proposal:

Fulfills your use case
Makes me happy in terms of OOP design
Gives us a blueprint to start making propagate_many leverage multiprocessing, CUDA, whatever
Requires minimal changes to poliastro at least initially
Allows evolving more advanced container-like behavior like slicing

Things left out:

"Propagate an array of Orbits a different sequence of epochs for each orbit". I don't think that use case is so typical, and it force us to implement mesh-like functions. I once tried to write very smart code using np.einsum and ended up rolling it back.
inplace=True. To me, the line tiled_orbs.propagate_many(epochs, inplace=True) suddenly placing a matrix of data inside tiled_orbs looks quite fragile, and I don't see the advantage over doing coords = tiled_orbs.propagate_many(epochs) anyway.
OrbitArray.epoch. I went for .epochs inspired by Series.dtype and DataFrame.dtypes. "If the user already knows what they want", even if it breaks our loosely defined "duck typing", they might as well use a slightly different property name.
NumPy-like behavior of OrbitArray. As you say, this is a lot of work: both on the implementation side and on the design side (we surely can agree on what np.repeat(orbarr, 3) does, but there are plenty of NumPy functions to discuss). Still, I 100 % agree with you with not closing the door for it.

More ideas

Let's not confuse .propagate for returning a new Orbit with .propagate for returning the coordinates! poliastro own terminology makes this extremely confusing, as https://github.com/poliastro/poliastro/issues/1364 describes.

Also, see https://github.com/astropy/astropy/issues/12600#issuecomment-1003044555 for a current status of astropy.units with non-NumPy data containers.

I think I see the light at the end of the tunnel:

orbarray.propagate(single_instant) returns another OrbArray with modified orbits, for consistency withOrbit.propagate(single_instant). perturbations allowed as showcased inthe quickstart guide.
orbarray.WHATEVER(timestamps_array) orWHATEVER(orbarray, timestamps_array) (seegh-1364 for discussion) returns aCartesianRepresentation matrix of coordinates N x M x 3 with physical units (in principle, adding the units should happen at the last step and we shouldn't carry them around, mitigating the performance impact).

comments:

OrbArray.propagate(single_instant, inplace=True) to be discussed at a later stage, even though I'm initially against it. in any case, it should not affect the overall design because we have to supportinplace=False as well.
OrbArray.propagate(timestamps_array) (which would return "an Orbit Matrix") left out for now, since in principle the use case would be covered by (2)
OrbArray.WHATEVER(timestamps_array, out=arr) to be discussed at a later stage, when we figure out a better way to interoperate with other containers. maybe https://uarray.org/ is the answer, but probably it's just hard to do and we won't have time for it.

there's lots of possibilities here but I think if we manage to do 1 thing that works and solves a real problem, people will start actually using it and help us prioritize further development, possibly with further rounds of funding, or a GSOC student.