ET WDC 2020 5 - wmo-im/et-acdm GitHub Wiki

Date and Time

1 September 2020, 13-15 UTC

Venue

Telecon

Participants

  • Jörg Klausen
  • Judd Welton
  • Alex Vermeulen
  • Markus Fiebig
  • Tom Kralidis
  • Atsuya Kinoshita
  • Gao Chen
  • Nate James
  • Valerie Dixon
  • Christopher Lehmann
  • Drasko Vasiljevic
  • Tomoaki Nisizawa
  • Eduardo Landulfo
  • Keiichi Sato
  • Claudia Volosciuk
  • Dietrich Feist
  • Jeannette Wild
  • Stoyka Netcheva (Rapporteur)

Excused

  • Anatoly Tzvetkov
  • Debra Kollonige
  • Ryan Stauffer
  • Kjetil Tørseth
  • Enrico Fucile

Agenda

  1. Welcome and acceptance of the proposed agenda (Jörg 5’)

  2. Approval of minutes from the meetings on 11 March and 11 June – corrections, comments (Jörg 5’)

  3. Status of survey on Data quality control practices implemented by GAW World Data Centres and Contributing Networks (Claudia, 20’)

WRDC QA-QC

  4. Report on progress, issues and challenges for aligning MD with WIGOS in OSCAR (all, 70’)

WDC

WDCPC: Christopher Lehmann, WDCPC Status Update

WDCA/WDCRG: Markus Fiebig, Connecting WDCA/WDCRG to WIGOS&OSCAR: Lessons and Issues 

WOUDC: Tom Kralidis

WDCGG: Atsuya Kinoshita

WRDC: Anatoly Tzvetkov

Contributing Networks

Summary of provided updates: Drasko Vasiljevic (SHADOZ)

Individual presentations:

  5. Next meeting (Jörg 5’)

Minutes

[in preparation, cf. links to presentations above]

  1. Welcome and acceptance of the proposed agenda

JK: The objective of today’s meeting is to hear about the progress of each database, what challenges you faced, what you learned, what you can contribute, and what more we can do to advance this process.

  2. Approval of minutes from the meetings on 11 March and 11 June – corrections, comments

No comments or corrections were made on the minutes, and they were accepted.

  3. Status of data quality control practices implemented by GAW World Data Centres and Contributing Networks

CV presented a summary of the survey on data QC tools and practices of Data Centres and DBs, related to Item 8.5 of the GAW IP 2016-23. 14 responses were received, with many links to documents and relevant information. What could the next steps be?

JK: Is the survey closed now, or can other DCs still respond?

CV: It is open and can be kept open as long as needed, so DCs can still respond.

JK encouraged everybody who has not yet responded to do so. It is important for GAW and the WMO Data Conference to know how this community operates. Claudia was asked to post the link again. Claudia volunteered to develop a paper out of this for the Data Conference in November, and a publication as a GAW report on where we are as a community, finding commonalities and approaches so that the various participants can benefit from what others have done. CV volunteered to coordinate such an activity to the extent possible. JK: Do you see value in such a compilation – WMO does, and the Data Centres themselves also?

CL: It is useful to have a compilation of methods and commonalities in QC and metrics, for example in the form of a webpage with links to documentation, so that the best practices of other groups can at least be consulted.

JW: I noted in the presentation that the questions were interpreted slightly differently by responders. For example, just listening to how this secondary QC is done, it occurred to me: yes, we also do this. The survey was a good start, but it is worth compiling all this information in one grand Google document which we can all see; it would then show us information we might have missed due to interpretation, and make the result much more robust than it is now.

CL: Reports are good for keeping a stack of historical records; this could be the first step in an iterative process. A Google document is a really good idea: a place to look at and edit jointly, and to see other methods and initiatives.

AV: I will fill in this survey, as I have not yet filled it in. CV: ICOS already participated.

TK: From our perspective, QA/QC applies to metadata; issues with data quality are pushed back to the data originators.

MF: It is worth turning those results, with additional information, into a report, otherwise nobody sees them. It is also good to see what other groups are doing, as one group is strong in one type of work and others in other types, and we can learn from each other here.

JK: No negative voices heard. CV is tasked with this report: to create a structure and circulate headings or subheadings with the group, on Google Docs or GitHub, where we can jointly work on this report.

Regarding the WMO Data Conference: it would be good to have a draft for it; that would be an excellent sign from this community. Claudia will let us know – a first draft in time for the conference is a challenge, but we should aim for a zero draft by then. We can start, and then comment and expand.

GC: Is this related to the work of ET-ACMQ?

SN: It is not related to their work, but the information will be shared.

  4. Report on progress, issues and challenges for aligning MD with WIGOS in OSCAR

DV reported on progress, issues and challenges with the WIGOS metadata implementation.

CL gave the first update. WDCPC collects acidic deposition data from around the world, presents the data online and runs a laboratory. Thanks to DV for help and information on WIGOS fields, OSCAR access, structure and tools. We have been working on our internal metadata structure, and this inspired us to work on the public release of WDCPC data, also using DOIs, via an open-source platform in Switzerland. We have an internal validation interface, station metadata, instruments, consistency between metadata descriptors and the data themselves, and a single server as data repository. Another platform to share data is the global assessment of precipitation chemistry on OpenAir; the platform works very well and provides DOIs as part of downloads. We are working to upload metadata online; we still have some issues with access, but we are making progress and hope the remaining technicalities will be resolved in the future.

JK: Have you documented the steps you took to make this progress; is it available? DV was asked to document and make available to everybody what was done and how, so that feedback can be provided. CL: We have a template we are happy to share; the survey and other documents we prepared could also be shared. We will be happy to provide any consistent information as a resource.

JK: You have developed your own workflow, from SQL Server to OSCAR upload. The technology – templates, scripts, XML templates, the code you use – would be helpful to share with others.

CL: It is not machine-to-machine (M2M) with XML yet; we are still having issues. The XML exports we produce from our own database are not yet matching what OSCAR expects – it is not accepting them yet. Drasko is working with Tom on those. We are updating the WMO manual that describes all of this. We have general workflows by which each station in the world sends us data using templates; this we can document in a useful format.
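The XML-export issue above is essentially about producing WMDR-conformant records. As a minimal sketch, assuming the WMDR 1.0 namespace (`http://def.wmo.int/wmdr/2017`) and a deliberately simplified element structure (a real record that OSCAR accepts is much richer), station metadata could be serialized like this:

```python
# Minimal sketch: generate a WMDR-style XML fragment for a station.
# The wmdr/gml namespace URIs follow the WMDR 1.0 schema, but the element
# structure here is simplified and illustrative, not a complete record.
import xml.etree.ElementTree as ET

WMDR = "http://def.wmo.int/wmdr/2017"
GML = "http://www.opengis.net/gml/3.2"
ET.register_namespace("wmdr", WMDR)
ET.register_namespace("gml", GML)

def station_xml(wigos_id: str, name: str) -> str:
    """Build a simplified facility record (field choice is hypothetical)."""
    root = ET.Element(f"{{{WMDR}}}ObservingFacility",
                      {f"{{{GML}}}id": f"facility-{wigos_id.replace(':', '-')}"})
    ident = ET.SubElement(root, f"{{{GML}}}identifier")
    ident.text = wigos_id
    nm = ET.SubElement(root, f"{{{GML}}}name")
    nm.text = name
    return ET.tostring(root, encoding="unicode")

print(station_xml("0-20008-0-ABC", "Example Station"))
```

A real export would validate the result against the WMDR XSD before upload; the sketch only shows the namespace mechanics.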

JK: I hope Drasko will follow up on this and document it. We also want to have the presentations on GitHub.

MF: WDCA, WDCRG: The issues we have are related to vocabularies, how to document licenses, and some other minor issues. The more we work on it, the more issues appear with vocabulary items that are not defined. To avoid this, we moved ahead with a minimum-information implementation, and we stayed consistent with the information from the facility/station MD; further maintaining that information will be for OSCAR to do. Deployments were not used. The concern is to not create fragmentation of data/time series in the future, for example due to replacing an instrument with one of the same type at a station. How transferable this will be is the question. For variables not yet defined we used NA, which is actually defined in the Code List, but OSCAR does not recognize it and does not accept it; either NA should not be in the Code List, or a way should be found for OSCAR to accept it.

One major issue was the vocabulary of observed variables. For aerosols, you often have to specify with modifiers what you are looking at. For example, the particle scattering coefficient can be measured for several size fractions (PM1, PM2.5, PM5, PM10) and observed at different temperatures and pressures. Depending on what kind of user you are, you might be looking for different things, and values with and without modifiers might not be comparable. WMO has a vocabulary where all those modifiers are hard-coded into each variable name; for WDCRG this means defining more than 1700 new variable names, counting only the variables and modifiers actually in use. We do not think that is feasible. Judd mentioned that he has similar issues, including wavelength dependence; WDCRG has similar challenges with similar data in our DB, and it could also apply to WDCPC. This is an overarching issue that we need to address. Our suggestion: we should not keep defining variables that include modifiers – that locks us into something inflexible – but instead amend the WMO WMDR to have an option to include modifiers. When we find a solution, we can use it in pre-approved variables. This will take time in the slow WMO approval process. New variables would comprise only the core variable, which can later be amended with modifiers, and we as a community should also start to collect information on which exact modifiers will be needed by the DCs.
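The "core variable plus modifiers" idea can be made concrete with a small sketch; the field names below (`size_fraction`, `wavelength_nm`, `conditions`) are hypothetical illustrations of possible modifiers, not WMDR elements:

```python
# Sketch of a core-variable-plus-modifiers model, as argued for above.
# Field names (size_fraction, wavelength_nm, conditions) are hypothetical.
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class ObservedVariable:
    core_name: str                       # e.g. "particle scattering coefficient"
    size_fraction: Optional[str] = None  # e.g. "PM1", "PM2.5", "PM10"
    wavelength_nm: Optional[float] = None
    conditions: str = "ambient"          # e.g. "STP", "ambient"

    def label(self) -> str:
        """Human-readable label built from the core name and modifiers."""
        parts = [self.core_name]
        if self.size_fraction:
            parts.append(self.size_fraction)
        if self.wavelength_nm:
            parts.append(f"{self.wavelength_nm:g} nm")
        return ", ".join(parts)

v = ObservedVariable("particle scattering coefficient",
                     size_fraction="PM10", wavelength_nm=550)
print(v.label())  # particle scattering coefficient, PM10, 550 nm
```

With such a model, the controlled vocabulary only needs one entry per core variable; the combinatorial explosion of ~1700 names is pushed into structured modifier fields.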

Another issue is how licenses are documented in WMDR. WMO has data policy and attribution. Those two things – policy and license – are different but often combined. A license is a legal statement. There is currently no option to document these in WMDR. We need an option to document data policy and data license separately, and a way to identify the data license legally. We will need this later for brokering, i.e. creating new products from existing data. This is a list of things I am discussing with different contacts that we might need in the future: a license, a license identifier, a link to the license, attribution, and support for more than one license on one data set (e.g. one for the general public and a different one for a specific entity). We also have an issue with the general term "aerosol", need a link to an external metadata record, and are working on DOIs.
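As a sketch of the separation argued for here, a dataset's rights could be recorded with policy, license(s) and attribution as distinct fields. All field names below are hypothetical, with an SPDX-style identifier used as the machine-readable license ID:

```python
# Hypothetical record separating data policy from data license(s),
# as discussed above; the field names are illustrative, not WMDR.
dataset_rights = {
    "data_policy": "WMO Additional",            # programme-level policy
    "licenses": [                               # a data set may carry several
        {"id": "CC-BY-4.0",                     # SPDX-style identifier
         "url": "https://creativecommons.org/licenses/by/4.0/",
         "audience": "general public"},
    ],
    "attribution": "Example attribution text required by the provider.",
}
print(sorted(dataset_rights))  # ['attribution', 'data_policy', 'licenses']
```

Keeping the license as a separate, identifier-based field is what makes later machine brokering possible: a broker can compare `id` values without parsing legal text.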

JW: Just a comment on data policy – I discovered that for Contributing Networks there are only two applicable options in the code list: "no data policy" or "inapplicable"; everything else says "WMO data" or something like it, which is not accurate and which I will never use for my dataset. There might be more to add to the data policy options.

JK: The policies to choose from in the WMO terminology are limited and high-level: WMO Essential, WMO Additional and WMO data. Those three are approved at this moment. WMO recognizes that this is not enough for GAW and other programmes that have contributions not only from NMHSs but from other sources. That is what is to be explained at the Data Conference in November, which was called by Congress to discuss these issues. We will explain where our contributors, and we ourselves, come from, what our needs are, and possibly provide some solutions; this is where we are expected to make a contribution. A study group was formed at high level which is looking at data issues and policy; it has had presentations from various members and different players (EUMETSAT, the private sector, others) and is looking to rephrase and provide for Congress approval a new draft resolution that tries to overcome the many limitations that have been recognized by WMO.

MF: As you say, we also deal with universities and other entities outside of the NMHSs. The data are not limited to WMO use. Are those entities outside of WMO represented in these discussions on policy?

JK: Yes, they are.

Back to your presentation: the points you raised are in the focus of the ET on WIGOS metadata and the ET on metadata – two WMO teams that have been established. Your bullet point on collecting requirements for modifiers and variables is needed work; we must develop an appropriate data model for how we can view and specify variables, and I would welcome it if you took a leading role in collecting those requirements so that we can model them in the most appropriate way.

GC: We are facing the same issues for aerosols, and I am interested in this working group. Maybe somebody covering radiation requirements would also be good for this.

JK: We need to focus not on how the model looks, but on the work to be done to develop a proper model: comprehensive, allowing extensions if time shows the model needs them, practical, and workable without implementing humongous machinery. I would like you to have discussions with Gao and myself to move this forward. We cannot do it at this telecon, but we can organize another one.

MF: Yes, we first need to have a break-out meeting, create an unedited list of what is needed, and then review it. JK: We can use GitHub to collect those requirements, if you can take the leadership to collect them; create a subsection on GitHub to collect those things and come to an agreement in the future. Indeed, the expression of vocabularies in the entire WIGOS model has proven too weak and insufficient in its current structure for WMO and needs to be corrected and then implemented; if we start today, we can be ready for implementation in 6 months.

MF: Would it be possible to have Judd in this group?

JW: I have documented them in multiple places, but having them somewhere centralized would make sense.

JK: Let's have a bilateral discussion in a week or two; I will react to that.

TK: We carry on operationally. We are working on WIGOS integration for WOUDC. We have done some initial work to publish metadata to OSCAR: as an NMHS we use the same approach (presented to this group a couple of times in the past), and now we are making it work for WOUDC station MD using the same tools; we got an initial implementation working 1.5 weeks ago. Now we are sorting out how to add observations to the output WIGOS MD so they get pushed up to OSCAR accordingly. We hope to have this in the next 8 weeks, at which point the changes to the tools will be publicly available, if anybody is willing to use this capability. It takes a bit longer because we try to do it in a non-custom way and push this functionality into a public tool, which takes more time than a one-off solution. When we did WIGOS MD through OSCAR, we typically took a facility-first approach: for each station we find the facility record, and then everything is centred around the facility object. The examples I have seen on adding observations in GitHub are observation-centred, so now we are trying to find a way that does not require changing much of the underlying machinery. I might have questions in the next couple of weeks. We are 1-2 months away from having this implemented in production.

JK: Who has worked with the pygeometa and pyoscar tools, other than Tom?

DV: JW has.

JW: I am not using pyoscar, but it was helpful to see how it interacts with the API. I have my own cURL(?)-based version with the same capability, further along on the observations side. We have spoken about how to make one common approach for doing this – Python-based, for example – rather than different approaches. That would be helpful for the community.

JK: Does anybody else know of the library developed by Tom? Do you find it useful or difficult? It would be useful to know.

AV: We looked at the documentation, more as an example of how the API works; as the documentation is not clear, it is easier to develop your own code.

TK: It is more for reference. Some people can use it directly, others just look at it to see how things work. We have both cases: sometimes it is used directly, sometimes just consulted.
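For orientation, interaction with OSCAR/Surface along the lines discussed here might look like the sketch below. The endpoint path and query-parameter names are assumptions for illustration (check the OSCAR API documentation or the pyoscar source before use); the request is only constructed here, not sent:

```python
# Sketch of preparing an OSCAR/Surface station search. The endpoint path
# and parameter names below are ASSUMPTIONS, for illustration only.
from typing import Optional
from urllib.parse import urlencode

OSCAR_SEARCH = "https://oscar.wmo.int/surface/rest/api/search/station"  # assumed path

def build_search_url(program: str, country: Optional[str] = None) -> str:
    """Build a station-search URL filtered by program affiliation."""
    params = {"programAffiliation": program}  # parameter name assumed
    if country:
        params["territoryName"] = country     # parameter name assumed
    return f"{OSCAR_SEARCH}?{urlencode(params)}"

def station_names(payload: dict) -> list:
    """Extract names from a search response (response shape assumed)."""
    return [s["name"] for s in payload.get("stationSearchResults", [])]

print(build_search_url("GAW", country="Switzerland"))
```

Keeping the URL-building and response-parsing pure, as here, is what makes such a client easy to test without touching the live service.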

DV: I used a single command to add the XML, and a Python script to upload it. I instructed Tom from WDCPC to use them, with simple instructions/descriptions, and he responded that he is ready to try them himself.

JK: We want all the pieces on this topic – documentation, scripts, code and links – in one place on GitHub, or linked to the places where the pieces can be found. Markus, please talk to Richard Richter; if you can contribute your workflow, code, XML and MD, that would be great.

TK: For all activities we work on, we typically have a page (wiki or other) in the repositories called "implementations", with information noting progress and showing people which way they can go. We could have a diagram of what the data centres are doing and provide as much detail as possible, so people can see what can be done and decide which way to go.

AK: We are working on the need to centralize and synchronize MD with GAWSIS and keep it up to date. I will report on the progress we have made on centralization and related activities. Current handling of MD: there are two types of MD – common MD and WDC-specific MD. Specific MD should be collected independently by each WDC. Common MD can be split in two: core and extended MD. Core common MD is basic station info, with no room for each WDC to be involved; we check it once a week, and no discrepancies exist now. When a new station wants to be registered in WDCGG, we advise them to go to GAWSIS for a GAW ID and to enter the common MD in GAWSIS. Discrepancies might exist when they send data; we then ask them to update the MD. We need to align our MD with GAWSIS, with common MD going to GAWSIS first. Extended MD might not always be present and needs manual checking. Centralized data submission is recommended, and we need to establish an MD distribution scheme. Work has started on the expansion: handling of common MD is as before – contributors enter common MD in GAWSIS – but they enter extended MD in WDCGG, which will send it in an XML file to GAWSIS, where the extended MD is stored. The challenge is to create and combine MD in an XML format that can be decoded by the OSCAR service. We used the XML converter for one station in Japan as an example; it completed with warnings. Thank you to Drasko and Lucia for the needed cooperation.

It was possible to register a new station contact and station PI, but it was only possible to update the station's WDCGG observations, not other WDCGG information. We aim to finish the XML conversion without losing any piece of information, but we get warnings about lacking required items in the device information. Another problem is that we can only add information for Japanese stations – this is a policy/permissions issue, not a technical one. We also need to have a contact for each data set, and we needed to develop tools for GAWSIS.
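The weekly check of common MD described above amounts to comparing core station fields between the WDC copy and GAWSIS. A minimal sketch, with hypothetical record shapes, field names and example values:

```python
# Minimal sketch of a weekly common-MD consistency check, as described
# above: compare core station fields held by a WDC against the GAWSIS
# copy. Record shapes, field names and values are hypothetical examples.
CORE_FIELDS = ("gaw_id", "name", "latitude", "longitude", "country")

def md_discrepancies(wdc: dict, gawsis: dict) -> dict:
    """Return {field: (wdc_value, gawsis_value)} for mismatching core fields."""
    return {f: (wdc.get(f), gawsis.get(f))
            for f in CORE_FIELDS if wdc.get(f) != gawsis.get(f)}

wdc_rec = {"gaw_id": "RYO", "name": "Ryori", "latitude": 39.03,
           "longitude": 141.82, "country": "Japan"}
gawsis_rec = dict(wdc_rec, longitude=141.83)  # simulated drift in one field
print(md_discrepancies(wdc_rec, gawsis_rec))
```

An empty result means the two copies agree, matching the "no discrepancies exist now" state reported above.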

JK: I take note of the slides and will look at those. Have you created tickets in the system? Was it reported to the OSCAR helpdesk?

AK: We received some help from Drasko and Lucia.

JK: Drasko will follow up on this. I am not aware of the XMLs that were uploaded, as I am not in that flow; I can look at some example files to see whether the format was improper or there was a machinery issue. The limitations on observations might be related to your role in OSCAR: the question is what identity you used – maybe it was as an NMHS, not as the WDCGG centre. Those two identities are different; as WDCGG you cannot use the NMHS role. What I would like to note is that the MD flow you described is exactly what we as a group need to agree on for the exchange of MD with other archives, global or programme-specific. Common/basic station MD is best left to be entered and updated by the member countries operating the stations, as in the GAW IP. The implication is that becoming a GAW station should proceed through registration in GAWSIS/OSCAR first, with station MD, and then data are submitted to the DC. Archives will provide the so-called extended/advanced MD in XML. For CNs we cannot enforce this, but it makes a lot of sense to handle MD this way.

AV: One comment. The scheme might be a bit more complicated. ICOS is a Contributing Network. I worked to register the stations in OSCAR which are part of ICOS, and we plan to update our metadata directly in GAWSIS ourselves – for example, instrument information we will make live, to always have synchronization between what we have in ICOS and in GAWSIS: getting information in real time from the stations and entering it in GAWSIS using an XML transfer of the MD. 34 stations have now been registered in OSCAR – done. I had some difficulties and found some lack of clarity in how the interface and the MD work. Those stations are also TCCON stations, and it is difficult to see in the interface whether they are TCCON or ICOS stations; some PIs had already registered some stations, and we corrected some, but these things are not that obvious in the system.

JK: This is a really weak and difficult spot. TCCON, ICOS and WDCGG all deal with GHGs, and most station data and MD are entered in WDCGG following the desired procedures. The question is: who updates the station MD in OSCAR and GAWSIS? We somehow need to agree amongst those involved. A technical solution could be made, but this is a governance problem.

AV: It is both. If you have uniform MD in different parts of the system, there should be one flow into OSCAR, not coming from the WDC; otherwise information goes around and there will be differences between the different systems, which should be avoided.

JK: How would you resolve this problem?

AV: There should be only one flow of MD going into the system. Basic information comes from the PIs: they maintain the instrumentation, they have a workflow, and the provenance of the data is kept in the MD flow. The ICOS portal keeps it, and then this information flows to GAWSIS and then to the WDC. WDCGG can be the source of MD for stations that are not in ICOS, but it needs to be decided who is responsible for what.

JK: It is a governance question – the problem is to agree between the different players. What we want to avoid is the same observations being registered several times in OSCAR and found several times in various archives without anyone knowing it is the same. We need to put this down in a flow diagram and see how it could be put into practice to have one source with multiple synchronizations.

AV: From the design criteria and architecture, there should be a one-direction flow for setting the MD of a station. Reading MD can happen from multiple directions.

JK: This is exactly what I tried to present in my slides last time, and Atsuya picked up some of that and presented it in his slides. Maybe we are not there yet, and we still need to agree for every single archive how MD is submitted.

JW: To address the technical side of this question: I think there is a larger hurdle in that area. The biggest challenge I have is dealing with older information, entered years ago, that was ported into OSCAR and WIGOS MD; others have also experienced it. Fixing this is much harder than starting with a new station and entering the information fresh. Can we consider a one-time purge of atmospheric composition information and start clean? Technically it would be much easier to move forward in that case. Right now we try to fix things while also entering new information, and that is hard. And my situation is not even as difficult as the TCCON and ICOS case we just heard about.

JK: I am not opposed to removing wrong information from the system. If this turns out to be the way to go – a solution that lets us assure up-to-date, comprehensive documentation that can be maintained – we can consider it and do it.

JW: If we decide on a system for the flow and purge all the old information, it will be easier to put all the information in.

JK: Dietrich – where does the interference between ICOS and TCCON come from? From the couple of sites where both have observations at the same station, both observing GHGs?

AV: I identified CO2 for two stations, but it is not possible to distinguish them clearly in the current interface. ICOS measures a vertical profile of CO2; TCCON does as well. So we need to repeat CO2 for each elevation with the same instrument. There is a distinction between column and other observations, but it needs to be repeated for each elevation in the MD. Such details and operational matters we probably need to discuss elsewhere, with people from the WDC.

JK: I propose that all groups who have a stake in GHGs form a subgroup to discuss how to deal with this. I can participate or not; I cannot resolve those things personally. Alex, Dietrich and Atsuya should be part of this discussion, and Drasko could support it. Have a telecon, identify the issues, take a couple of examples of the general issues, and provide suggestions for concrete solutions.

AV: Reactive gases people might also need to participate.

JK: Those might be two different groups. We must have a concrete plan and be sure it will work. We must make sure we have all the accurate information to put in. Purging can be done in seconds; the other work needs to be done and might take a little longer, so we need a plan for that, and once we are sure the result will be what we want, we can consider it. Technically we can do it.

AV: Lets do that.

DV: Updates from the CNs. I will personally present the input from SHADOZ, as I was asked to. The others can present themselves. First is Dietrich.

DF: TCCON is a network of ground-based FTIR stations measuring column-averaged GHGs. The number of stations is the same as in 2019. There is no central organization or funding whatsoever, but we have common standards defined within our community, and the goal is to have the highest-precision column-averaged observations of CH4, CO2, CO and N2O for cal/val of the respective satellite observations. We started this work 3 months ago. I am the contact and TCCON deputy chair, a station PI, and the TCCON chairs are very interested in having TCCON in OSCAR; I am the person to be contacted for OSCAR, and I am already responsible for setting up and maintaining the generation of XML MD for the TCCON DOIs. We try to do a good job filling in the metadata. Current status: there is outdated MD for a few stations in OSCAR/GAWSIS, as MD is not maintained there. Those entries were made when TCCON initially joined GAW years ago, entered by Caltech, with the 17 stations that existed at that time. Some of those have closed in the meantime, and a number of others were set up, so there will have to be some deletions. I personally agree that it is easier to delete the old entries and start from scratch with new ones, especially when it is handled on a station-by-station basis by one or a small number of TCCON people rather than by individual PIs; this is the approach we took with the XML data for the DOIs, as it would have been a nightmare to do that with individual PIs. I looked at the WIGOS MD standard and what is required to be provided. I think a large part of those MD is the same network-wide, or is station information already in our DOI MD, but part needs to be collected from the PIs, such as station contact info, which can change. Every 5-6 years we have a new release of the processing software, we reprocess everything and update the MD for the DOIs, and this will be the time to do this collection of information and fill in all the WIGOS fields.

We are currently in the process of rolling out the next reprocessing update of the DOIs, and we have set up tools to more easily collect site-specific MD from the PIs. There is no web-based tool for this; I made this tool, and I will update it to include all the forms we need to ask for the WIGOS MD that is not yet part of our DOI MD. The same process as for the DOI update will then collect the MD needed for WIGOS and for internal needs. Realistically, this will be done in spring 2021. We will be updating our internal MD by December, in several rounds. There is no quick solution, but early 2021 is realistic. I need to know whether WIGOS is a moving target or is set: if it is set, I can ask our PIs for those fields of information once and that will be it; otherwise it may change by the time we gather the info.

JK: Most MD elements in OSCAR are time-stamped; contacts are not, but are the most recently available. Most of the information is time-stamped, so there is a history. If several stations are no longer in operation, it does not mean that they should not show: they were historically there and will stay. This is explicit in the WMO WIGOS effort – a record of observations. The analysis part of OSCAR can go back and show the situation of networks in the past, and this adds an additional layer of complexity.

DF: As long as the MD of old stations does not interfere.

JK: It cannot be a moving target, but standards evolve and need backward compatibility. With the change-control process, we are at the end of version 1, approved by WMO. It is not going to change dramatically in content, rather in how it is presented, and for a few years there will be backward compatibility.

DF: The challenge is whether the info I get from the PIs will need to change.

KS: EANET: MD are sent to our data centre when a station is set up and when a piece of information changes. The template with the information is prepared in MS Word and is not machine readable, so it is difficult to process this information automatically. We prepared a sample XML for one station in Japan for inputting our MD into OSCAR, together with Drasko, based on one of the methods he proposed. I am ready to send this file by e-mail to confirm whether it works well. Updating the information will take some time because of the programming part and because some parts are done manually. The other option, having the Focal Point account input all information manually, is easier to do. We will try to use these two options in parallel and transfer all our MD. First I will prepare an XML file for testing and see if it works.

Issues I found in OSCAR: wrong information in different MD fields for stations that are already in OSCAR.

DV: We will continue exchanging information and files by e-mail to confirm that they are working.

TN: SKYNET: We are trying to create MD and to modify our systems to generate the MD for more than 100 observation sites in Asia, India and Europe, and the network is still expanding. It is organized regionally in subnetworks, which are responsible for operating and maintaining the instruments. Raw data are transferred from each individual SKYNET regional data centre (RDC) to the international data centre (ISDC) in Japan. There, products of aerosol optical properties approved by the international SKYNET committee are generated. The data processing system was old; a new system is now being used, with more information/products being generated.

GC: I was introduced to this MD while working on variable naming requirements from NASA for my projects, which are remote sensing. We implemented standard names attached to the variables. We have ingested more than 29000 files since October, with more than 5000 variable names, and we learned some lessons there. We also developed tools for ingesting and archiving data over 20-25 years. We have had file format checking incorporated for 20 years, and recently we included collecting metadata using the same tool, expanding its functionality while dealing with the data of 5-7 NASA field campaigns simultaneously; Jeannette is leading us in developing tools for NDACC. All the tools we have are intended to have modern and wide functionality: archiving files and checking format requirements. The challenge we have is related to the diversity of standardization – the question of whether we should have a standard name for a given variable comes up almost every day. Data integrity is also very important: if the standard is not followed, it takes a lot of time to fix things. We spend 80% of our time on format and 10% on the data files. Uncertainty, and how it is determined and reported properly, is another issue; for certain applications people need the propagation of uncertainty. Data providers do not get resources for handling extended MD and its reporting, which can be time-consuming. I do not have a deep understanding of WIGOS, so we need help from Drasko. Once we understand it, we can do more and make our data ingest better for TOAR and NDACC.
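The format checking mentioned above typically includes validating file variable names against a controlled vocabulary. A tiny sketch of such a check, with a hypothetical stand-in vocabulary rather than the actual NASA standard-name list:

```python
# Sketch of the kind of variable-name check described above: flag file
# variable names that are not in a controlled vocabulary. The vocabulary
# here is a tiny hypothetical stand-in, not the real standard-name list.
STANDARD_NAMES = {"Latitude", "Longitude", "Time_Start", "CO2_dry", "CH4_dry"}

def check_variable_names(names: list) -> list:
    """Return the names that are not in the controlled vocabulary."""
    return [n for n in names if n not in STANDARD_NAMES]

bad = check_variable_names(["Latitude", "co2_dry", "CH4_dry"])
print(bad)  # ['co2_dry'] – case mismatches are flagged too
```

Rejecting files at ingest with a report like this is far cheaper than fixing non-conforming names after archiving, which is the 80%-of-time-on-format problem described above.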

EL: LALINET: We started in 2015 by joining WMO, and we have an agreement. We are a Latin American coordinated lidar network measuring aerosol backscatter coefficient and aerosol extinction profiles for climatological studies of the aerosol distribution over Latin America, plus a few other species. Our objective is to establish a consistent, statistically sound data set for aerosols – and for several stations O3 and other species – to enhance the understanding of the aerosol distribution over the continent and its direct and indirect influence on climate. We work closely with MPLNET to consolidate the measurement and data acquisition protocols and establish a QA/QC routine among all stations. Our goal is to improve and establish a unified data analysis routine common to all stations, e.g. the Single Calculus Chain. We are working on creating a scientifically significant distributed database – e.g. lidar ratio, particle extinction, backscatter, Ångström exponents and particle depolarization regional values – that can be assimilated into air quality and forecast models and used for validation missions. We have 3 main stations in Brazil. We measure volcanic ash. Some of our stations are in maintenance now, and we still struggle to find out what is going on. There is no centralized funding as in Europe, and we struggle to obtain sustained support and raise funds. 3 months ago I was approached by Drasko with respect to entering information in OSCAR. We have annual meetings, and we also do a lot of this individually. We have a lot to catch up on, but it is a good opportunity to be here.

JW: MPLNET: This is my 3rd update on this open action item. I will talk about the process and workflow and how I decided to do this, as it could be useful for other CNs. I create a JSON template, similar to the YAML template Tom uses for WOUDC, and ultimately make an XML file from it. At the end I will give a summary of where I am at.

  1. Started by downloading stations already in OSCAR. Developed a program to interact with the OSCAR API search and upload functions, and matched extracted sites based on co-location within 1 km in lat/lon. Identified 3 categories requiring different approaches: match-affiliated, match-not-affiliated, and not existing in OSCAR (to be created).

  2. Developed a new JSON template to contain the MPLNET metadata required in OSCAR, designed so it can be used by other networks as well. It is then uploaded as OSCAR XML, but it is easier to maintain as JSON than as an XML file. You need to go over all the Code Lists and OSCAR requirements and document what does not exist in OSCAR, and include the status of the MPLNET-OSCAR (or your program's) affiliation. What I do is modify my internal DB to actually contain all this information, then fill in and maintain this template via a cron job, so it is automated.

  3. Developed a program that reads the JSON template and creates XML files for all MPLNET sites. Whether your site/station is already affiliated in OSCAR determines what the content of your XML file will be and how you construct it. If a station is not in OSCAR yet, I can easily create an XML file because the GML IDs and all the very specific content are new. If my station exists in OSCAR but is not affiliated with my program, after 4-5 months I can now create the XML fairly easily, upload it with the upload API, and modify the station. However, if the station exists in OSCAR and has already been affiliated with another program, creating the XML file is more complicated: you have to pay attention to GML IDs and how you want to edit them, and I do not have time to explain all of that. After I make an XML file, I save the XML file and JSON template in a web-accessible directory on my server for easy access, and also upload it to OSCAR using the OSCAR API. This work aimed to assist contributing networks' interactions with OSCAR whether or not they use the API; to simplify collection of the information needed to create and edit WMDR XML files using the API; and to include important information not already in WMDR.
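The co-location matching in step 1 can be sketched as follows. This is a minimal reconstruction, not the actual MPLNET code; the station records and program names are illustrative:

```python
# Sketch of the station-matching step: match our sites against stations
# already in OSCAR by co-location, treating anything within ~1 km as the
# same site, and classify into the three categories from the minutes.
from math import radians, sin, cos, asin, sqrt

def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

def classify_site(site, oscar_stations, max_km=1.0):
    """Return 'match-affiliated', 'match-not-affiliated', or 'not-in-OSCAR'."""
    for st in oscar_stations:
        if distance_km(site["lat"], site["lon"], st["lat"], st["lon"]) <= max_km:
            if "MPLNET" in st.get("programs", []):
                return "match-affiliated"
            return "match-not-affiliated"
    return "not-in-OSCAR"

# Illustrative stand-ins for records returned by an OSCAR search.
oscar = [
    {"name": "GSFC", "lat": 38.99, "lon": -76.84, "programs": ["MPLNET"]},
    {"name": "Lauder", "lat": -45.04, "lon": 169.68, "programs": ["NDACC"]},
]
print(classify_site({"lat": 38.993, "lon": -76.838}, oscar))  # match-affiliated
print(classify_site({"lat": -45.04, "lon": 169.68}, oscar))   # match-not-affiliated
print(classify_site({"lat": 0.0, "lon": 0.0}, oscar))         # not-in-OSCAR
```

The 1 km tolerance absorbs small coordinate differences between the network's own records and what is registered in OSCAR.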

The JSON template has 3-4 sections, but only 2 are required: Header and Sites. Help text is included, with notes for different people and links to where to find the information.
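The template described above might look roughly like this. All field names, the WIGOS identifier, and the URL are hypothetical, since the actual MPLNET template is not shown in the minutes:

```python
# Hypothetical shape of the JSON template: a required "header" section with
# network-level metadata and a required "sites" section with one entry per
# station, including its OSCAR affiliation status from the matching step.
import json

template = {
    "header": {
        "network": "MPLNET",
        "program_affiliation": "MPLNET",      # OSCAR program affiliation
        "contact": "https://mplnet.gsfc.nasa.gov",  # illustrative URL
        "_help": "One entry per site; see OSCAR code lists for valid values.",
    },
    "sites": [
        {
            "name": "GSFC",
            "wigos_id": "0-20008-0-GSF",      # made-up identifier
            "latitude": 38.99,
            "longitude": -76.84,
            "oscar_status": "match-affiliated",  # tracks per-site OSCAR state
            "variables": ["aerosol backscatter coefficient"],
        }
    ],
}

# A cron job can regenerate and re-serialize this from the internal DB.
print(json.dumps(template, indent=2)[:60])
```

Keeping the OSCAR status inside the template is what lets the downstream XML generator decide, per site, which of the three creation/edit paths applies.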

When it comes to variables, you can save them as a list and they are mapped to the code list. For the program that reads the JSON template and creates the XML files, one of the biggest issues is paying attention to GML IDs; that is probably why people see duplicated items when they try to add something. Another issue is knowing which elements you can leave blank if you do not want to edit them, and how to actually code that in the XML file. This was a painful process of trial and error. Editing existing observation elements still has problems: updating time/date periods in an observation element does not work properly, and the observing facility cannot fit on one line and is hard to follow. In summary: if a station exists in OSCAR and is already affiliated, then XML file creation is very complicated. Right now 9 existing stations have out-of-date information that must be edited, and many new observations must be added. It was a very difficult process to figure out how to edit existing information using the API. After help and back-and-forth between me, Drasko, Jörg and Lucia, at the moment there is only one remaining issue. A new version of OSCAR is being released with new logic and more GML IDs; how will it affect what has been done? We need descriptions and details of the new elements and changes. There remain many problems and/or errors with existing OSCAR code lists and requirements that I have already documented; some are being addressed, but most appear not to be. The biggest problem by far remains the lack of a good help/tutorial system for using OSCAR and adding/updating metadata. Credentials, permissions and identities might still not be resolved. Who is the user of OSCAR, what are their requirements, and what happens if OSCAR crashes tomorrow?
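The JSON-to-XML step and the GML ID concern can be illustrated with a minimal sketch. The element and attribute names below are simplified stand-ins, not the real WMDR schema; only the GML namespace URI is real:

```python
# Minimal sketch of generating a WMDR-like XML fragment for one site from
# the JSON template. The schema here is simplified; the point is that each
# element carries a gml:id, which must be unique and stable, or OSCAR can
# end up showing duplicated items on upload.
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml/3.2"
ET.register_namespace("gml", GML_NS)

def site_to_xml(site):
    """Build a facility element for one site record from the template."""
    root = ET.Element("ObservingFacility", {f"{{{GML_NS}}}id": site["gml_id"]})
    ET.SubElement(root, "name").text = site["name"]
    pos = ET.SubElement(root, "position")
    pos.text = f'{site["latitude"]} {site["longitude"]}'
    for var in site["variables"]:
        ET.SubElement(root, "observation").text = var
    return ET.tostring(root, encoding="unicode")

xml = site_to_xml({
    "gml_id": "mplnet-gsfc",   # must be unique across uploads
    "name": "GSFC",
    "latitude": 38.99,
    "longitude": -76.84,
    "variables": ["aerosol backscatter coefficient"],
})
print(xml)
```

For a brand-new station every gml:id can simply be minted fresh; editing an already-affiliated station means reusing the IDs OSCAR assigned, which is where the complications described above come from.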

JK: Thank you very much for your tremendous contribution, Judd. You have spent a lot of time and effort, and that is why it is of very high value to document it. This is a new global effort that takes time and, at this stage, involves a lot of pain. If OSCAR crashed, then the WDQMS WIGOS Data Quality Monitoring System would also crash. It was put in place to give an overview of the actual status of the observing stations, and it is still not running for all observations, in particular not for those we are interested in. For the weather community those 2 components are already KEY to the functioning of the global observing system, used by network managers and developers, for funding proposals, etc.

GC: GAW station information will be of more and more value for mission and campaign planning and execution in the future. We used TCCON station information in the past.

JK: Date and time details are required by climatologists. Data archaeology needs details on data and observations; data and metadata rescue projects (what was observed, where and when) exist now. People are just learning the value of MD, and a long-term effort must be maintained. Model assessment and validation are other uses.


AI1: SN or CV to share the whole report on data QA-QC with the group for comments and additions

AI2: CV to create a first draft GAW report on data QA-QC; then all will comment and edit, along with providing additional information

AI3: Collect requirements for modifiers and variables and work to develop an appropriate data model for how to represent them (MF, GC, JW, JK)

AI4: AK reported a WDCGG problem: they can only add station information for Japanese stations. This is a policy/permissions issue, not a technical one. Need to have a contact for each data set doing this work, or to be given authorization. (SN, JK, OSCAR team, WIGOS management? Who needs to be engaged to resolve such issues?)

AI5: All groups with a stake in GHG to form a sub-group to discuss how to deal with the MD flow: hold a telecon, identify issues, take a couple of examples of general issues, and provide suggestions for concrete solutions for GHG. (Alex, Dietrich and Atsuya should be part of this discussion; Drasko could support it.)