Creating your own models

Certain algorithms in Watson Natural Language Processing can be trained with your own data. For example, you can create custom models for entity extraction and for data classification.

Language support for custom models

You can create custom models and use the following pretrained dictionary and classification models for the languages shown in the following table. For a list of the language codes and the corresponding languages, see Language codes.

Table: Supported languages for out-of-the-box custom models

Custom model | Supported language codes
Dictionary models | af, ar, bs, ca, cs, da, de, el, en, es, fi, fr, he, hi, hr, it, ja, ko, nb, nl, nn, pl, pt, ro, ru, sk, sr, sv, tr, zh_cn, zh_tw (all languages supported in the Syntax part of speech tagging)
Regexes | af, ar, bs, ca, cs, da, de, el, en, es, fi, fr, he, hi, hr, it, ja, ko, nb, nl, nn, pl, pt, ro, ru, sk, sr, sv, tr, zh_cn, zh_tw (all languages supported in the Syntax part of speech tagging)
SVM classification with TFIDF | af, ar, ca, cs, da, de, el, en, es, fi, fr, he, hi, hr, it, ja, ko, nb, nl, nn, pl, pt, ro, ru, sk, sr, sv, tr, zh_cn, zh_tw
SVM classification with USE | ar, de, en, es, fr, it, ja, ko, nl, pl, pt, ru, tr, zh_cn, zh_tw
CNN classification with GloVe | ar, de, en, es, fr, it, ja, ko, nl, pt, zh_cn
BERT Multilingual classification | af, ar, ca, cs, da, de, el, en, es, fi, fr, he, hi, hr, it, ja, ko, nb, nl, nn, pl, pt, ro, ru, sk, sr, sv, tr, zh_cn, zh_tw
Stopword lists | ar, de, en, es, fr, it, ja, ko
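
Before you train a model, it can help to check which custom model types list a given language. The following is a minimal sketch that transcribes the language codes from the table into a plain Python lookup. The dictionary keys and the helper name models_supporting are illustrative assumptions and are not part of the watson_nlp library.

```python
# Language codes transcribed from the table above.
# The dictionary keys and the helper name are illustrative, not part of watson_nlp.
_BASE = set(
    "af ar ca cs da de el en es fi fr he hi hr it ja ko nb nl nn "
    "pl pt ro ru sk sr sv tr zh_cn zh_tw".split()
)

SUPPORTED_LANGUAGES = {
    "dictionary": _BASE | {"bs"},
    "regex": _BASE | {"bs"},
    "svm_tfidf": _BASE,
    "svm_use": set("ar de en es fr it ja ko nl pl pt ru tr zh_cn zh_tw".split()),
    "cnn_glove": set("ar de en es fr it ja ko nl pt zh_cn".split()),
    "bert_multilingual": _BASE,
    "stopwords": set("ar de en es fr it ja ko".split()),
}


def models_supporting(language_code):
    """Return the sorted custom model types that list the given language code."""
    return sorted(
        model
        for model, codes in SUPPORTED_LANGUAGES.items()
        if language_code in codes
    )
```

For example, models_supporting("bs") returns only the dictionary and regex entries, because Bosnian is not listed for any of the classification models.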

Saving and loading custom models

If you want to use your custom model in another notebook, save it as a Data Asset to your project. This way, you can export the model as part of a project export.

Use the project-lib library to save and load custom models.

To save a custom model in your notebook as a data asset to export and use in another project:

  1. Ensure that you have an access token on the Access control page on the Manage tab of your project. Only project admins can create access tokens. The access token can have viewer or editor access permissions. Only editors can inject the token into a notebook.
  2. Add the project token to a notebook by clicking More > Insert project token from the notebook action bar and then run the cell.

    By running the inserted hidden code cell, a project object is created that you can use for functions in the project-lib library. For details on the available project-lib functions, see Using project-lib for Python.

  3. Run the train() method to create a custom dictionary, regular expression, or classification model and assign this custom model to a variable. For example:

     custom_block = CNN.train(train_stream, embedding_model.embedding, verbose=2)
    
  4. Save the model as a Data Asset to your project using project-lib:

     project.save_data("<model name>", custom_block.as_file_like_object(), overwrite=True)
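
The save call in step 4 can be wrapped in a small helper if you save several models from the same notebook. This is only a sketch: save_custom_model is a hypothetical name and not part of project-lib. It assumes that project is the object created by the inserted project token cell and that custom_block is the model returned by train().

```python
def save_custom_model(project, custom_block, model_name):
    """Save a trained custom model as a data asset in the project.

    Hypothetical convenience wrapper around the call shown in step 4.
    `project` is the object created by the project token cell, and
    `custom_block` is the model returned by train().
    """
    if not model_name:
        raise ValueError("model_name must be a non-empty string")
    # Same call as in step 4; overwrite=True replaces an existing asset
    # that has the same name.
    return project.save_data(
        model_name, custom_block.as_file_like_object(), overwrite=True
    )
```

In a notebook, this would be called as save_custom_model(project, custom_block, "<model name>").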
    

To load a custom model to a notebook that was imported from another project:

  1. Ensure that you have an access token on the Access control page on the Manage tab of your project. Only project admins can create access tokens. The access token can have viewer or editor access permissions. Only editors can inject the token into a notebook.
  2. Add the project token to a notebook by clicking More > Insert project token from the notebook action bar and then run the cell.

    By running the inserted hidden code cell, a project object is created that you can use for functions in the project-lib library. For details on the available project-lib functions, see Using project-lib for Python.

  3. Load the model using project-lib and watson-nlp:

     custom_block = watson_nlp.load(project.get_file("<model name>"))
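
The load step can be sketched as a matching helper. The name load_custom_model and the load_fn parameter are assumptions made for illustration. In a notebook you would pass watson_nlp.load as load_fn. Keeping the loader as a parameter lets the sketch run without importing watson_nlp.

```python
def load_custom_model(project, model_name, load_fn):
    """Load a custom model that was saved as a data asset.

    Hypothetical wrapper around the call shown in step 3.
    `project` is the object created by the project token cell, and
    `load_fn` is expected to be watson_nlp.load when run in a notebook.
    """
    if not model_name:
        raise ValueError("model_name must be a non-empty string")
    # project.get_file returns the stored asset as a file-like object,
    # which is handed directly to the loader, as in step 3.
    return load_fn(project.get_file(model_name))
```

In a notebook, this would be called as custom_block = load_custom_model(project, "<model name>", watson_nlp.load).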
    

Parent topic: Watson Natural Language Processing library