To derive an optimal observation system for surface ocean

The ocean is a major sink of anthropogenic CO

The majority of observations contributing to the Surface Ocean CO

Here, we extended the scope to the Atlantic basin, including the Atlantic
sector of the Southern Ocean. We explored design options for a future
augmented Atlantic-scale observing system that would optimally combine data
streams from various platforms and contribute to reducing the bias in
reconstructed surface ocean

The remainder of the article is structured as follows. Sect. 2 presents the model output, the observing systems, observations, the design experiments, and the description of the statistical model. Results are presented and discussed in Sect. 3. Section 4 is dedicated to the conclusion and the presentation of perspectives.

Here we present the ensemble of observing platforms that either already
perform measurements to estimate

Three observing platforms were selected for the study: (1) voluntary
observing ships providing in situ measurements of surface ocean CO

Spatial distribution of datasets used for training (number of
measurements per grid point and 5 d time step):

Here we used the numerical output from an online-coupled
physical–biogeochemical global ocean model, the Nucleus for European Modelling of the
Ocean (NEMO)/PISCES model, at 5 d
resolution. This configuration of the NEMO framework was implemented on a global tripolar grid. It coupled
the ocean general circulation model OPA9 (Madec et al., 1998), the sea ice
code LIM2 (Fichefet and Maqueda, 1997) and the biogeochemical model
PISCESv1 (Aumont and Bopp, 2006). Information on the simulation is given in
Gehlen et al. (2020) and Terhaar et al. (2019), including the evaluation of
the modelled mean state and the seasonal cycle of sea surface temperature
and sea–air fluxes of CO

Table 1 summarises experiments designed for different combinations of observing platforms.

Information on Observation System Simulation Experiments.

The first test is based on individual sampling data extracted from the SOCAT database. As mentioned before, these data provide good coverage of the Northern Hemisphere. The sparser coverage in the Southern Hemisphere results in a larger spread among methods based on these observations only (Denvil-Sommer et al., 2019; Rödenbeck et al., 2015). This motivated experiments with additional data from Argo profilers limited to the Southern Hemisphere. An experiment based on the full physical Argo network was included to evaluate the method under high spatial and temporal coverage (an optimal, yet unrealistic case).

We tested combinations of SOCAT data and (1) total Argo data, (2) Argo only in the Southern Hemisphere, and (3) 25 % or (4) 10 % of the initial (total) Argo distribution. Finally, these experiments were repeated with additional mooring data. It is worth noting (Table 1) that OSSE 4 is closest to the target of the BioGeoChemical (BGC)-Argo program, with a BGC-Argo density corresponding to 25 % of the existing Argo distribution. However, we chose OSSE 3 as the benchmark against which to evaluate individual experiments. This experiment has a high data density and provides additional information on a potential future BGC-Argo network.
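The reduced Argo configurations described above can be illustrated with a short sketch. The text does not state how the 25 % and 10 % subsets were drawn, so the seeded random selection below is an assumption, as are the function name `subsample_floats`, the `(float_id, latitude)` record layout and the synthetic float list.

```python
import random


def subsample_floats(floats, fraction, southern_only=False, seed=0):
    """Return a reproducible subset of Argo float records.

    floats        -- list of (float_id, latitude) tuples (hypothetical layout)
    fraction      -- share of floats to keep, e.g. 0.25 or 0.10
    southern_only -- if True, keep only floats south of the Equator first
    seed          -- fixes the random draw so the subset is reproducible
    """
    pool = [f for f in floats if f[1] < 0] if southern_only else list(floats)
    n_keep = round(len(pool) * fraction)
    rng = random.Random(seed)
    return sorted(rng.sample(pool, n_keep))


# Synthetic example: 80 floats with latitudes from -60 to 58.5 degrees.
floats = [(i, -60 + 1.5 * i) for i in range(80)]

quarter = subsample_floats(floats, 0.25)                       # 25 % of all floats
south_tenth = subsample_floats(floats, 0.10, southern_only=True)  # 10 %, Southern Hemisphere only

print(len(quarter), len(south_tenth))
```

Fixing the seed keeps the subsets identical between runs, which makes experiments that differ only in Argo density easier to compare.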

We used a feed-forward neural network (FFNN) based on Denvil-Sommer et al. (2019) to reconstruct surface ocean

The numbers of hidden layers and parameters/weights depend on the amount of data used for training. In this work, the FFNN was applied separately for each month (one model for January, one model for February, etc.). A subset of 50 % of the data was used for training. A total of 25 % participated in the evaluation of the model during the training algorithm, and 25 % were used to validate the model after training. These data were chosen regularly in time and space: every third grid point was kept for evaluation, and every fourth grid point was kept for validation. Table S1 in the Supplement presents the numbers of training data for each month and each OSSE. To adjust the number of FFNN parameters/weights we followed the empirical rule that suggests limiting the number of parameters to the number of training data points divided by 10 to avoid overfitting (Amari et al., 1997). The FFNNs for all OSSEs except OSSE 2 have four layers (two hidden layers) with 1116 parameters in total. The input layer has 15 input nodes and 20 output nodes that represent the input for the first hidden layer. The first hidden layer has 25 output nodes, and the second hidden layer has 10 output nodes. OSSE 2, which is based on Argo data for the period 2008–2010, has significantly fewer data for training, and thus its FFNN is different: three layers (one hidden layer with 20 input and 10 output nodes) with 541 total parameters.
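The stated parameter totals can be checked with a few lines of arithmetic. The sketch below assumes fully connected layers with one bias per output node and a single output node at the end of the network; this reading is not spelled out in the text, but it reproduces both totals. The function names are illustrative.

```python
def dense_parameters(layer_sizes):
    """Count weights plus biases of a fully connected network.

    layer_sizes lists the node counts from the inputs to the output,
    e.g. [15, 20, 25, 10, 1].
    """
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def max_parameters(n_training_points):
    """Empirical ceiling: at most one parameter per 10 training points."""
    return n_training_points // 10


# 15 inputs -> 20 -> 25 -> 10 -> 1 output (all OSSEs except OSSE 2)
main_ffnn = dense_parameters([15, 20, 25, 10, 1])
# 15 inputs -> 20 -> 10 -> 1 output (OSSE 2)
osse2_ffnn = dense_parameters([15, 20, 10, 1])

print(main_ffnn)   # 320 + 525 + 260 + 11 = 1116
print(osse2_ffnn)  # 320 + 210 + 11 = 541
```

Under the divide-by-10 rule, the larger network would call for at least 11 160 training points per monthly model and the smaller one for at least 5410.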

All data have to be normalised before their use in the FFNN, as exemplified
for SSS:

Normalisation is required to rank all predictors on the same scale and to avoid the possible influence of one predictor with strong variability (Kallache et al., 2011).
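As an illustration of why a common scale matters, the sketch below standardises two predictors to zero mean and unit standard deviation. This z-score form is an assumption chosen for illustration and is not taken from the authors' normalisation formula; the function name and the sample values are synthetic.

```python
from statistics import fmean, pstdev


def standardise(values):
    """Rescale a predictor to zero mean and unit standard deviation."""
    mean = fmean(values)
    std = pstdev(values)
    if std == 0:
        raise ValueError("constant predictor cannot be standardised")
    return [(v - mean) / std for v in values]


# Synthetic predictors with very different ranges of variability.
sss = [34.8, 35.2, 36.1, 35.5, 34.9]   # salinity-like values
sst = [2.0, 11.5, 27.3, 18.4, 6.1]     # temperature-like values

sss_n = standardise(sss)
sst_n = standardise(sst)

# After rescaling, both predictors have the same spread.
print(round(pstdev(sss_n), 6), round(pstdev(sst_n), 6))
```

Without such a step, a predictor with large raw variability would dominate the weighted sums entering the first layer of the network.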

Following Denvil-Sommer et al. (2019) we normalised the geographical
positions (lat, long) in the following way:

The comparison between OSSEs is done per biome, following Rödenbeck et al. (2015) (Fig. 2, Table 2). Biome 8, North Atlantic ice, has been omitted due to poor data coverage in all OSSEs. Reconstructions over this region are expected to yield large biases that could interfere with the interpretation of results from individual OSSEs.

Map of biomes (following Rödenbeck et al., 2015; Fay and McKinley,
2014) focused on the region 70

Biomes from Fay and McKinley (2014) used for time series comparison (Fig. 2).

In order to simplify the comparison, we used Taylor and target diagrams with
standard deviation, biases, correlation and normalised RMSD (uRMSD) of the
mean of four FFNN outputs for each OSSE. Here uRMSD is estimated as follows:
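The statistics entering Taylor and target diagrams can be sketched as follows. The code assumes the conventional target-diagram definition, in which uRMSD is the RMSD computed after removing the mean of each series; this is an assumption, not a transcription of the authors' formula. The function name and the sample values are synthetic.

```python
from math import sqrt
from statistics import fmean, pstdev


def skill_scores(reconstructed, reference):
    """Return bias, standard deviation, correlation and uRMSD.

    Both inputs are equally long sequences of co-located values.
    """
    if len(reconstructed) != len(reference):
        raise ValueError("series must have the same length")
    mean_rec = fmean(reconstructed)
    mean_ref = fmean(reference)
    # Anomalies: each series with its own mean removed.
    anom_rec = [v - mean_rec for v in reconstructed]
    anom_ref = [v - mean_ref for v in reference]
    urmsd = sqrt(fmean([(a - b) ** 2 for a, b in zip(anom_rec, anom_ref)]))
    covariance = fmean([a * b for a, b in zip(anom_rec, anom_ref)])
    correlation = covariance / (pstdev(reconstructed) * pstdev(reference))
    return {
        "bias": mean_rec - mean_ref,
        "std": pstdev(reconstructed),
        "correlation": correlation,
        "urmsd": urmsd,
    }


reference = [350.0, 360.0, 370.0, 380.0]
offset_only = [352.0, 362.0, 372.0, 382.0]   # constant offset, same shape
scores = skill_scores(offset_only, reference)
print(scores["bias"], scores["urmsd"], scores["correlation"])
```

With this definition a constant offset appears only in the bias, while errors in the shape of the signal appear in uRMSD and the correlation, which is why the two diagrams are read together.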

Further, the maximum absolute value from four outputs, maxValue

The final mean difference meanD

The SD of the mean difference Diff

The time series of the mean value from four FFNN outputs for

Figure 3 shows the Taylor diagram (correlation coefficient between
reconstructed

Taylor diagram of 11 OSSEs summarised in Table 1; the colour code
corresponds to Fig. 2, and the purple colour represents all eight biomes combined:

Target diagram per biome for 11 OSSEs; the colour code corresponds
to Fig. 2, and the purple colour represents all eight biomes combined:

Correlation coefficient and standard deviation (

Normalised root-mean-square differences and biases (

OSSE 4 (square) and OSSE 5 (rhombus) are based on OSSE 3, the only
difference being the percentage of Argo data used: OSSE 3 uses 100 %, OSSE 4 uses 25 %
and OSSE 5 uses 10 %. The results of OSSEs 4 and 5 are similar to those
obtained for OSSE 3. The largest difference is observed over biome 17 (Figs. 3, 4i): correlation coefficients are 0.85 (OSSE 3), 0.77 (OSSE 4), and 0.75
(OSSE 5); biases are

OSSEs 6 (triangle), 7 (inverted triangle), and 8 (pentagon) were trained on
SOCAT data complemented with Argo data in the Southern Hemisphere. In
general, the skill scores are lower than for OSSE 3, especially for OSSE
8 (10 % of Argo data in the Southern Hemisphere), where results approach
those of OSSE 1 (Fig. 3). Large differences are obtained for biomes 12 and
17 (Figs. 3, 4e and i): in biome 12 (17), correlation coefficients for
OSSEs 6, 7, and 8 are 0.64 (0.86), 0.54 (0.8), and 0.52 (0.66) compared to
0.79 (0.85) for OSSE 3; uRMSDs are 11.46 (10.01), 13.3 (11.03), and
13.87 (15.16)

Differences (Eq. 4) between OSSE FFNN outputs and NEMO/PISCES

Reconstruction skill scores improve when data from mooring stations are
added to OSSEs 6, 7 and 8, giving OSSEs 9 (hexagon), 10 (star) and 11
(triangle centroid) (Figs. 3 and 4, Tables 3 and 4). Over the ensemble of eight
biomes, the decrease in the amount of Argo data goes along with a general
decrease in correlation coefficients, i.e. 0.88 (OSSE 9), 0.85 (OSSE 10), 0.83
(OSSE 11), and an increase in uRMSDs, i.e. 8.37

Figure 5a, b and c present the differences between reconstructed

Differences between OSSE FFNN outputs and NEMO/PISCES

Figure 5d, e and f present the standard deviations (SD) of differences for
all four outputs for each OSSE FFNN (Fig. 5d – OSSE 1; Fig. 5e – OSSE 3; Fig. 5f – OSSE 10) (Eq. 5). Over most of the Atlantic Ocean, SD varies between 0 and 10

Figure 6 shows the correlation between the mean value of four OSSE outputs and
NEMO/PISCES

Correlation coefficient between OSSE FFNN outputs and NEMO/PISCES

Correlation coefficient between OSSEs and NEMO/PISCES

In Fig. 7, time series of

Figure 7c–h illustrates time series of reconstructed

Biome 13, the subtropical permanently stratified South Atlantic (Fig. 7e
and f), corresponds to a region with low data coverage. Its dynamics are
similar to those of biome 11 in the Northern Hemisphere; however, the data
coverage in biome 13 amounts to only 15 % of that in biome 11
(Fig. S5). We observe a large difference between

The Southern Ocean ice biome (biome 17) is characterised by sparse data
coverage and a bias towards the ice-free season. The results for biome 17
are presented in Fig. 7g and h. OSSE 1 underestimates the

Results for all OSSEs and for all biomes are included in the Supplement (Table S4, Figs. S6–S11).

Figure 8 shows the sea–air CO

The relationship between the average number of Argo floats (5 d period)
and the error in

Averaged number of Argo profiles per 5 d time step over
2008–2010 versus averaged differences between each OSSE

The aim of this work was to identify an optimal observational network of

The results suggest that the addition of data from Argo floats could
significantly improve the accuracy of FFNN-based ocean

Reducing the percentage of Argo data used in our experiments slightly
decreases the accuracy (Figs. 3 and 4, Tables 3 and 4). A lower percentage of
Argo data corresponds, however, to a more realistic distribution of
instruments and to the target of the global BGC-Argo network. The results
remain comparable to those of OSSE 3. The best compromise between the statistics
yielded by the comparison between reconstructed

The OSSE 10 network could be further improved by instrumenting Baffin
Bay, the Labrador Sea, the Norwegian Sea, and regions along the coast
of Africa (10

Including errors from in situ measurements is one of the next steps of this
work. Real measurements contain instrumental and representation errors.
Adding such errors to the pseudo-observations will help to estimate the
impact of observations on the reliability of the OSSEs presented in this work.
It will include the errors for predictor values (SSS, SST, SSH, CHL, MLD,

Code that provides an estimation of OSSE 3 for July and code used to create figures can be found at

Data used within this study are available upon request. Please contact the corresponding author.

The supplement related to this article is available online at:

ADS, MG and MV contributed to the development of the methodology and designed the experiments, and ADS carried out the experiments. ADS developed the model code and performed the simulations. ADS prepared the paper with contributions from all coauthors.

The authors declare that they have no conflict of interest.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The authors would like to thank an anonymous referee and Luke Gregor for
their helpful comments and questions and Florent Gasparin for providing the reference data
of Argo distributions. Anna Denvil-Sommer is currently funded
by the Royal Society (grant no. RP

This research has been supported by the European Commission, H2020 Research Infrastructures (AtlantOS, grant no. 633211), and GreenGrog (GMMC).

This paper was edited by Katsuro Katsumata and reviewed by Luke Gregor and one anonymous referee.