Call minutes - ML-Schema/core GitHub Wiki
Minutes from the ML Schema call March 5, 2018
ACTIONS:
Journal paper
- we will make a shortlist of the journals to consider for publication of the paper on ML-Schema and contact their editors inquiring whether they would be interested in such a paper before submission
- two submission types to consider:
- position paper: describing our vision
- research paper: we discuss the developed mapping and show applications
We should highlight the added value of our alignment; one benefit discussed was that it should facilitate exchanging data between applications.
Within the next two weeks we should have an overview of the possibilities (@Larisa and @Joaquin are contacting the editors of selected journals; more could be considered).
ML-Schema extensions
- ML-Schema extensions to consider: deep learning, unsupervised learning, streaming data
We should start with deep learning (i.e. model an ML-Schema deep learning extension). For that, we need to collect requirements, for instance taking into account:
- Amazon’s schema (focusing on feature provenance)
- the experiments at OpenML (e.g. using Keras, and going deeper into representing models)
The next meeting will be devoted to this topic.
- @Tommaso - will start a survey on deep learning model exchange
The next call will be on March 19, 2018 5pm CET.
Minutes from the ML Schema call May 8, 2017
Actions:
- OpenML use case:
a) @Joaquin will meet @Tommaso on Tuesday 16 to push forward calling Tommaso's script (it is not yet integrated into the OpenML platform)
b) @Agnieszka will check whether there is a mapping of ML Schema to OpenML (both should be used)
- All: add related papers to the reproducible research Mendeley group created by @Diego
- @Agnieszka: will talk about use cases in scientific publishing during ESWC 2017
- @Agnieszka: to talk to the editors of the Semantic Web Journal
- @Agnieszka: will check possible use cases related to Research Objects
- @Larisa: to contact Data Science journal editors
Suggestions:
- @Larisa: maybe form an Advisory Board and ask them for endorsement
- @Diego suggests setting a deadline for producing a tool for transforming between schemas (until August?); we need tools
The next call is May 22, 2017
Minutes from the ML Schema call April 24, 2017
Actions:
- @Agnieszka will ask about the planned publication of the WOP 2016 chapter
- @Larisa will contact Springer about promoting the schema
- Amazon released a schema https://github.com/awslabs/ml-experiments-schema and @Joaquin is in contact with the people involved, and will arrange a call to discuss ML-Schema and Amazon’s schema together with Amazon (tentative date: June 6th)
- @Joaquin will coordinate with @Tommaso on further steps regarding the application of ML-Schema to OpenML
The next call is May 8, 2017
Minutes from the ML Schema call November 21, 2016
Decisions:
- We will set up the next call around mid January
- One major topic of this planned January call will be modeling deep learning
- We will issue a call for use cases by then.
- There is an initial plan for an OpenML hackathon in March, which could tackle the topic of modeling deep learning
- We should also think about the data mining (DM) community, not only the machine learning (ML) community. The former may be more interested in knowledge-based approaches, metadata and schemas, and might be closer to cross-domain use cases. Cross-domain use cases (beyond pure DM/ML) may require more metadata interoperability.
Minutes from the ML Schema call November 7, 2016
Decisions with regard to ways to move forward:
- a good "position" publication, possibly a journal paper in a machine learning journal; proposals on where and how to publish (and who should be involved) are welcome
- use cases: issue a call for use cases and implementations; also collect (possible) use cases among ourselves. They may be identified in our own current and future projects. They will also provide feedback for possible extensions of the schema. Note: some use case types we have identified may be found here [1]. This list may be expanded, but we also need specific use cases falling within some of these types.
[1] https://github.com/ML-Schema/core/wiki/UseCases
Minutes from the ML Schema call September 26, 2016
Decisions: The documentation of the ML Schema will be finalized within a week, then (after polishing) released to the W3C mailing list before ISWC2016. @Agnieszka will integrate the last bits and pieces, including the material from @Pance, and then ask everybody to proofread. @Tommaso will check whether the openml2rdf code could also be released shortly.
Minutes from the ML Schema call August 15, 2016
Decisions: We will finalize the draft documentation by the end of August.
@Diego:
- Documentation - Introduction:
- to add the text on the motivation for ML Schema, e.g. to align existing ontologies and schemas, which is why we propose only a high-level, lightweight model
- to also motivate by the need for reproducible research
- Documentation - linking to other resources:
- to make the first draft on how the proposed schema is compliant with other resources and can be used together with other ontologies and resources to provide more detailed information, e.g. with: DM/ML ontologies and schemas, PROV
For both points, the text from the WOP2016 paper can be reused. For 2) @Pance will follow?
@Joaquin:
- To better structure Section 2:
- give a general description of the main concepts, very short, with hyperlinks to the full class descriptions, before showing the figure
- then make a separate (sub)section, and say from the start that the example is based on an OpenML Run, with hyperlinks to the website with the online run description
@Agnieszka:
- Integrate missing bits and pieces:
- put the schema into the http://www.w3.org/ns/mls workspace so it is dereferenceable
- cite LODE tool on the documentation website
- check & update example file (MLSchemaSandbox.ttl) so it is consistent with the documentation file, provide a link for downloading the example?
- provide information on what tools to use to open example(s)
- Integrate @Larisa’s example describing Study into the documentation.
Minutes from the ML Schema call July 4, 2016
Actions & plans:
- Put the schema into the http://www.w3.org/ns/mls workspace so it is dereferenceable
- Cite LODE tool on the documentation website
- Check & update example file (MLSchemaSandbox.ttl) so it is consistent with the documentation file, provide a link for downloading the example?
- Provide information on what tools to use to open example(s)
- A plan for a position paper / statement paper. Where to publish it as a journal paper? (a community type of submission). Before that, finish and disseminate the draft report.
- @Larisa works on the second, complementary example (focused more on experiments and studies and provenance) (the first example is focused more on runs)
Minutes from the ML Schema call June 20, 2016
Decisions: We have agreed that it would be good to try to finalize the draft documentation before most of us leave for holidays. We need to make clear in the documentation: What is it for? (maybe link to the goals of our group and use cases), Why? (again look into the goals of the group, such as to align existing schemas and to avoid proliferation of very similar resources), and How can ML Schema be used? (provide examples and show how to link to other resources). Some issues identified to achieve this:
- Documentation - Introduction:
- to add the text on the motivation for ML Schema, e.g. to align existing ontologies and schemas (cite them in the references?), which is why we propose only a high-level, lightweight model
- to also motivate by the need for reproducible research
- to add something on the Audience ("This document is mainly addressed to ML researchers / practitioners, ...", "for them to accomplish specific goals ..." etc.)
- Documentation - linking to other resources:
- to describe (in the section after the introduction of the core model) that the proposed schema is compliant with other resources and can be used together with other ontologies and resources to provide more detailed information, e.g. with: DM/ML ontologies and schemas, software ontologies, PROV, Investigation-Study-Assay, datatype ontologies etc.
- Documentation - other:
- to cite OpenML and say that the example is derived from OpenML
- add to the current example the information on the task type (that :task29 is of type ClassificationTask). Add this to the text and also add this to the example code in turtle (with another namespace outside ML Schema core).
- explain what this schema is NOT meant for (e.g., it is not meant to be a comprehensive ontology of ML that replaces existing models; those are already quite comprehensive, have various goals and are not going to be replaced)
- Incorporate the information on the envisaged use cases? They are listed on the Wiki of the group.
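The task-type point above (that :task29 is of type ClassificationTask, using a namespace outside ML Schema core) could look roughly like the following Turtle fragment. This is only a sketch under assumptions: the `ext:` and default `:` namespaces are hypothetical placeholders, and `mls:Task` is assumed to be the core class name under the planned http://www.w3.org/ns/mls namespace.

```turtle
# Assumed core namespace (the one the schema is planned to be published under).
@prefix mls: <http://www.w3.org/ns/mls#> .
# Hypothetical namespace for task types defined outside ML Schema core.
@prefix ext: <http://example.org/ml-extension#> .
# Hypothetical namespace for the instances of the example.
@prefix :    <http://example.org/sandbox#> .

# :task29 stays a core Task and is additionally typed with the external class.
:task29 a mls:Task , ext:ClassificationTask .
```

Keeping the specific task type in an external namespace leaves the core model lightweight while still letting the example carry the more detailed information.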
The way we are going to proceed: @Diego will be working on points 1) and 2) until the end of the week (or next Monday at the latest), and later on he will pass it to the next person, who will pass it to @Larisa on July 1st.
Other news:
- @Tommaso and @Joaquin are working hard on openML2rdf code which is close to being complete.
Minutes from the ML Schema call June 6, 2016
Decisions:
- Documentation: extend the text, finish UML-based figure, add another figure with an example concerning instances; Should we mention use cases (OpenML)?
- Properties: better not to restrict domains and ranges (or not too much), so as not to prevent interoperability and re-use (see for instance the recent talk of Michael Uschold at Know@LOD2016)
- To release quickly a draft document and a call for feedback to the W3C mailing list and some other mailing lists (semantic web, machine learning)
- Create Twitter account? (also for disseminating the news on draft model)
- The importance of tools; To check openML2rdf code (@Joaquin); Converter from RDF to JSON-LD and back available from OpenML
- another use case (@Pance?)
Minutes from the ML Schema call May 23, 2016
Decisions:
- to add a link between Run and HyperParameterSetting to reflect that HyperParameterSetting is an input to Run (OpenML follows similar modeling) --> Issue #19
- to make optional that Task is "definedOn some evaluation specification"; instead reverse this relation and make a link from EvaluationSpecification to Task (e.g.: isAssociatedWith) --> Issue #20
- feature & data (issue #15): do no more than there is now and leave it as it is now
- to make optional that Run "has output some model evaluation" --> Issue #21 - DONE
- ML software (Issue #18): yes, it is good to incorporate it; at the end we can simply add a class Software and use an external vocabulary / ontology (such as software ontology SWO or https://en.wikipedia.org/wiki/DOAP) to model whatever is needed such as a library name (scikit-learn, RapidMiner, ...), a version, a project name etc.
- use gh-page so it is more convenient to view the documentation --> DONE
- to decide what to do further with the diagram
- to create an issue on whether (and how) we should model relations' domains and ranges --> Issue #22
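As an illustration of two of the decisions above (Issue #19 and Issue #18), the Turtle fragment below sketches a Run that takes a HyperParameterSetting as input and a Software resource described with DOAP. It is a sketch under stated assumptions, not the agreed model: `mls:hasInput` is an assumed property name for the new link, `mls:Software` is the class proposed in the minutes, and all instance names in the default `:` namespace are hypothetical.

```turtle
# Assumed core namespace and the DOAP vocabulary namespace.
@prefix mls:  <http://www.w3.org/ns/mls#> .
@prefix doap: <http://usefulinc.com/ns/doap#> .
# Hypothetical namespace for the example instances.
@prefix :     <http://example.org/sandbox#> .

# Issue #19: the hyperparameter setting is modelled as an input to the run.
# mls:hasInput is an assumed property name.
:run1 a mls:Run ;
    mls:hasInput :setting1 .

:setting1 a mls:HyperParameterSetting .

# Issue #18: a Software class in the core, with details delegated to DOAP.
:scikitLearn a mls:Software , doap:Project ;
    doap:name "scikit-learn" .
```

Further details such as a version could be attached through the external vocabulary as well (DOAP offers `doap:release` and `doap:Version` for this), so the core schema would not need its own properties for them.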