    1NB Interact

    Features

    • Interact with your data using AI.
    • The AI generates Python code that is executed in an executer; the AI then observes the result.
    • With workers, your data does not leave your computer.

    Choose executer


    Executer               Features                           Privacy                           User requirements   Tokens needed
    1nb executer           Code executes in 1NB's server      Data saved securely in 1NB        None                Compute + AI
    Self hosted executer   Code executes in User's computer   Data saved securely in 1NB        Start a worker      AI
    Worker                 Code executes in User's computer   Data resides in User's computer   Start a worker      AI

    OneNB: CLI/Python API for 1NB

    Features

    • Use 1NB with your existing Git repository and Jupyter notebooks
    • Export data directly from your notebooks to 1nb:
      • from OneNB.exports import DataX
        with DataX.writer(".df") as wr:
            my_dataframe.to_pickle(wr)
        Python
    • Directly import data from a saved tag in 1nb into your Jupyter notebooks
      • from OneNB.i.myrepo.tag import MyData
        data_df = pd.read_csv(MyData.reader())
        Python
    • Import data from other notebooks you are working on (pipelining):
      • from OneNB.working import DataX
        Python
    • The worker automatically reruns notebooks (or new interacts) on your local computer, so you can use your data without it leaving your system
    • And many more

    Installation

    Supported Python version: >= 3.10

    Linux/Mac

    Install the wheel (download) with pip (preferably in a virtual env):

    python3 -m venv ~/.1nbvenv
    source ~/.1nbvenv/bin/activate
    python3 -m pip install OneNB-X.X.X-py3-none-any.whl
    Cmd

    If you have sudo permission, make the 1nb command accessible from anywhere:

    sudo ln -s ~/.1nbvenv/bin/1nb /usr/local/bin/1nb
    Cmd

    Otherwise, activate the environment when running 1nb commands:

    source ~/.1nbvenv/bin/activate
    1nb init
    Cmd

    Windows

    Warning: the worker function does not currently run on Windows.

    Make sure that the Python version is at least 3.10: python --version

    Install the wheel (download) with pip (preferably in a virtual env):

    cd C:\path\to\home\
    python -m venv .1nbvenv
    .1nbvenv\Scripts\activate
    python -m pip install C:\path\to\downloads\OneNB-X.X.X-py3-none-any.whl
    Cmd

    Activate the environment when running 1nb commands:

    C:\path\to\home\.1nbvenv\Scripts\activate
    1nb init
    Cmd

    Create a repository

    To be able to submit code & data to your 1nb repository, you will first need to create a repository locally. Since 1nb relies heavily on version control with Git, you can only create a 1nb repo inside a Git repository.

    git init
    1nb init --name <repository name>
    Cmd

    Executing this will set up the necessary 1nb files (in .1nb). It will also add those files to the Git changeset, so make sure to run git commit afterwards.

    Choose your data storage

    Your data (exports) can be stored in one of several ways:


    Type    Provider   Options       Stores in                 Local executer   1NB executer   Command
    Local   -          -             Local machine             yes              no             1nb init local
    Cloud   AWS S3     -             1NB's secure data store   yes              yes            1nb init cloud s3
    Cloud   AWS S3     Bucket name   User's S3 bucket          yes              no             1nb init cloud --bucket <bucket> s3
    Cloud   Azure      Bucket name   User's Azure data store   yes              no             1nb init cloud --bucket <bucket> azure
    Cloud   GCP        Bucket name   User's GCP data store     yes              no             1nb init cloud --bucket <bucket> gcp

    Use 1nb init --help to access more options.

    • Local
      • Data is stored only on your local system.
    • Cloud (aws s3/azure/gcs)
      • Data is stored in the cloud (azure and gcs are not functional yet).
      • Cloud credentials will be read from the current system (they will not be saved).
      • You can specify your own bucket and a key prefix with --bucket and --key-prefix.
      • If --bucket is not set, data will be saved securely in 1nb's remote storage.
      • You can pass an AWS profile with --profile if a non-default set of AWS credentials should be used.

    An executer is needed to run Interact (New). If 1NB is chosen, the code is executed on 1nb's server; a local worker executes the code on the user's system.

    Additional options

    • --requirements-file Packages needed by your Python project (other than OneNB). These packages will be automatically installed in a virtual env before running the project notebooks.
    • --profile AWS profile from which credentials will be taken when cloud S3 storage is used.
    1nb init --name <repository name> --requirements-file <pip requirements file>
    Cmd

    Local code

    Python module (.py) files are not saved in 1nb. If your notebooks depend on them, the code (as well as any additional metadata) needs to be packaged setuptools-style with a pyproject.toml.

    This makes the package pip-installable.

    Finally, add the package path to the requirements file above: ./my/project/path
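
    As a rough sketch, a minimal pyproject.toml could look like the following (the name, version, and dependencies below are placeholders, not values 1nb requires):

    [project]
    # hypothetical metadata; adjust to your own project
    name = "my-local-code"
    version = "0.1.0"
    dependencies = ["pandas"]

    [build-system]
    requires = ["setuptools"]
    build-backend = "setuptools.build_meta"
    Toml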

    Prepare a Notebook

    Saving data

    from OneNB.exports import MyData
    Python

    This code creates a data export named MyData. Writing data to it is similar to using the open syntax for saving binary data to disk:

    import numpy as np

    # np.save expects the file object first, then the array
    with MyData.open(suffix=".npy", mode="wb") as out:
        np.save(out, array)
    Python

    More examples:

    import pickle
    from OneNB.exports import labels
    
    vocab = {...}
    
    with labels.open(suffix=".pkl", mode='wb') as fe:
        pickle.dump(vocab, fe)
    Python
    import pandas as pd
    from OneNB.exports import Sales
    
    out = pd.DataFrame(...)
    
    with Sales.open(suffix=".csv") as f:
        out.to_csv(f)
    Python

    A suffix is required to identify the type of data being saved. To save text data, change the mode to text: .open(mode="t", suffix=".json").
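
    For instance, a text-mode export of JSON data might look like this (the export name Config and its contents are illustrative):

    import json
    from OneNB.exports import Config

    settings = {"threshold": 0.5, "mode": "fast"}

    # mode="t" opens the export as a text stream, so json.dump can write to it
    with Config.open(mode="t", suffix=".json") as f:
        json.dump(settings, f)
    Python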

    This code can be run from any script or notebook as many times as needed. The export name MyData is a unique identifier, so exporting with the same name from other locations will overwrite it.

    Parameterize your data & notebooks

    The parameters in a notebook should additionally be defined in a cell that is tagged parameters. This cell should be the first cell of the notebook:

    year = 2023
    foo = "bar"
    Jupyter Cell (Tags: [parameters])

    Later:

    p1 = dict(year=year, foo=foo)
    from OneNB.exports._p1 import Report
    Jupyter Cell

    Defining the parameters in this way makes it easy to modify key assumptions when executing the notebook. The parameter values should be simple literals and should not include objects.

    Also, if the parameter values are simple (boolean, string, number), 1nb will allow the user to modify them in the web UI when executing the notebook later.

    Warning: you should include all the variables from the parameters cell in the import definition (p1) as well. Failing to do so will result in errors at the push stage.

    Mark Results

    Often a notebook has some key cells that contain the final result. It may be some charts or text data. Tagging these cells as result will make the output from those cells easily viewable in the web UI. Code from these cells will not be shown.

    Save run

    After executing your notebook with a parameter set you should add it to 1nb:

    1nb addnb --nb notebook.ipynb
    Cmd

    This operation will identify the parameters in the notebook and save them for later upload. It needs to be done every time you have a notebook ready with a unique parameter set.

    If this notebook is not the one that was originally executed (e.g. it was renamed or moved after execution), add the original one with --original. In that case --nb should be the notebook that was saved after execution:

    1nb addnb --nb notebook.ipynb --original original.notebook.ipynb
    Cmd

    Push

    Before your work can be uploaded, it first needs to be committed and tagged in Git:

    git commit
    git tag reportsv1
    
    1nb push
    Cmd

    Running 1nb push uploads all the data exports and notebooks to the cloud. If this is the first push, use 1nb push --create instead. After that you should be able to view this repository on 1nb.ai.

    Worker

    To use interact nb (with self hosted / keep data locally) or run nb (rerun notebooks), you will need to create a worker process. Install the Python client and run:

    1nb worker --repo <repository>
    Cmd

    If you are using interact nb + keep data locally, you will also need to provide the file path to the data:

    1nb worker --repo <repository> --use-data <path>
    Cmd

    If the repository does not already exist, use --create-repo after the repository name: --repo <repository> --create-repo.

    This feature is currently not supported in Windows.

    Reuse data

    Data can be loaded with syntax similar to exporting, but with the tag/commit info to identify the revision of the code that exported it. The data is then read as usual in Python, using .reader() wherever a file handle is expected:

    from OneNB.i.repository.tag import MyData
    
    # example: read csv
    import pandas as pd
    my_data = pd.read_csv(MyData.reader())
    
    # example: read pickle
    import pickle
    my_data = pickle.load(MyData.reader())
    
    # parameterized data
    p = {"year": 2023, "foo": "bar"}
    from OneNB.i.repository.tag._p import Report
    Python

    Exporting data from such a script/notebook will produce a dependency on the exports MyData and Report. Viewing the export in the web UI will list these dependencies.

    Also, if this is a notebook and the dependent exports are also generated by notebooks, 1nb will automatically create a pipeline such that the dependent notebooks are rerun (if required) before rerunning this notebook.

    If the repository name has special characters that make Python throw a SyntaxError, replace those characters with an underscore (_).
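
    For example, assuming a hypothetical repository named my-repo:

    # "my-repo" is not a valid Python identifier, so the "-" becomes "_"
    from OneNB.i.my_repo.tag import MyData
    Python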

    How it works

    • When you execute a Jupyter notebook that imports/exports data from/to 1NB:
      • It identifies the path of the executing notebook
      • Fetches the data from 1NB if data is being imported from a saved tag
      • When data is being exported, it stores the binary data into an internal file
        • For local storage only the path to this internal file is uploaded to 1nb
      • Stores the import/export metadata to .1nb/exports.json
        • If notebook parameters are used, then it also associates the data with the current parameter dictionary
    • After executing and saving the notebook, use 1nb addnb
      • It saves a copy of the notebook to .1nb/
      • Allows you to re-execute the notebook while only changing the parameters cell.
      • On re-executing and repeating addnb, it will record all the executions with unique parameter sets.
    • On finalizing your work
      • Save your work to git: git commit, git tag, git push
      • Tagging the commit allows 1nb to associate the exported data with this version tag
      • It identifies the current git branch and parent git branches:
        • If a parent branch has a notebook with the same name, then the current notebook is recorded as a revision of the parent
      • It only uses the local branch name, not the remote
      • Push the notebooks and data (metadata) with 1nb push
    • On Push
      • Notebooks that export data are always uploaded to 1NB
      • Exported data is uploaded according to data storage
      • Local Python modules (.py) are not uploaded to 1NB by default. Follow the Local code section above to package them, so notebooks that depend on them will not fail on rerun/interact.
    • Worker
      • You start a worker with 1nb worker
      • It listens for active tasks from 1NB
      • It creates a virtualenv (or reuses existing ones)
      • Executes the incoming code and sends back the results
      • When the notebook is finalized, the worker itself saves and pushes the notebook back to 1NB

    API

    Data Exports

    class Exported(ModuleType, SaveableValue)
    Python

    Export data to 1nb. You can write to this like a file (only writing is supported) using the file open syntax:

    from OneNB.exports import exp
    
    with exp.open(mode="wt", suffix=".txt") as f:
        f.write("Hello world")
    Python

    Other writers: text_writer, writer, write_csv, write_as_json and write_stream

    open

    @contextmanager
    def open(*, suffix: str, mode: str = "wt", encoding: str = "utf-8")
    Python

    Writer similar to file open:

    with _object_.open(suffix=".txt") as f:
        f.write("Hello world")
    Python

    Arguments:

    • mode - 't' for text data, 'b' for binary data;
    • suffix - file name suffix to signify the file type;
    • encoding - byte encoding format if text data, default utf-8

    text_writer

    def text_writer(file_ext: str)
    Python

    A file-like writer for text data. Shorthand for .open.
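
    A minimal usage sketch, assuming file_ext plays the same role as the suffix argument of .open:

    # write text through the shorthand writer
    with exp.text_writer(".txt") as f:
        f.write("Hello world")
    Python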

    writer

    def writer(file_ext: str)
    Python

    A file-like writer for binary data. Shorthand for .open.
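
    A minimal usage sketch, mirroring the DataX.writer(".df") example earlier on this page:

    import pickle

    # binary counterpart of text_writer
    with exp.writer(".pkl") as f:
        pickle.dump({"a": 1}, f)
    Python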

    write_csv

    def write_csv(data: list[list[str]])
    Python

    Write an object (list[list[str]]) as CSV using the excel dialect
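
    A minimal sketch (the rows are illustrative):

    # each inner list is one CSV row
    rows = [["name", "qty"], ["apples", "3"], ["pears", "5"]]
    exp.write_csv(rows)
    Python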

    write_as_json

    def write_as_json(data, cls: Type[json.JSONEncoder] | None = None)
    Python

    Write an object as JSON; cls is the same as in json.dump
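
    For example:

    # serialize a plain dict to a JSON export
    exp.write_as_json({"year": 2023, "foo": "bar"})
    Python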

    write_stream

    def write_stream(stream: io.BytesIO | io.StringIO, file_ext: str,
                     encoding: str)
    Python

    Write data directly from an IO buffer
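
    A minimal sketch, assuming the encoding argument applies to text streams:

    import io

    # build an in-memory text buffer, then hand it to the export
    buf = io.StringIO()
    buf.write("col1,col2\n1,2\n")
    exp.write_stream(buf, ".csv", "utf-8")
    Python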

    save_value

    def save_value(data: any, _special: any = None)
    Python

    Save a value directly, outside of data storage. Can save up to 100 bytes and supports only native Python types.

    Note: this will be saved in 1NB, regardless of this repository's data storage settings.
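
    For example:

    # small native values only (up to 100 bytes); always stored in 1NB
    exp.save_value(42)
    Python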

    open_no_upload

    def open_no_upload(*,
                       fn_write_bytes_to: Optional[WriteToBuffer] = None,
                       fn_returning_fileobj: Optional[ReturnsReadable] = None,
                       source_type: str,
                       params: dict[str, bool | int | str | None])
    Python

    Directly load data from an external source, and store the source parameters.

    exp.open_no_upload(source_type="my_file", params=("/path/to/my/file",))
    Python
    s3_client = boto3.client("s3")
    
    def download_from_s3(buffer, bucket, key):
        s3_client.download_fileobj(bucket, key, buffer)
    
    exp.open_no_upload(source_type="my_s3", fn_write_bytes_to=download_from_s3, params=("my_bucket", "my_key"))
    Python

    Arguments:

    • fn_write_bytes_to optional - a function that writes the data into a buffer; the buffer is supplied as the first parameter
    • fn_returning_fileobj optional - a function that returns a readable file-like object that points to the data
    • source_type - a string for user reference
    • params - parameters passed to the function (or to Python's builtin open(file, ...) function)
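
    A sketch of the fn_returning_fileobj variant, assuming (as in the examples above) that params are forwarded to the callable as positional arguments:

    # hypothetical: return a readable file object pointing at the data
    def open_local(path):
        return open(path, "rb")

    exp.open_no_upload(source_type="local_file",
                       fn_returning_fileobj=open_local,
                       params=("/path/to/my/file",))
    Python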

    Data Import

    class Imported(ModuleType, SupportsBytes, LocalStorePaths)
    Python

    Buffered imported data. Get a readable file-like object with .reader() and use it wherever a readable file-like object is expected.

    import json
    from OneNB.i.repo.tag import MyData

    json.load(MyData.reader())
    Python

    If it's a value, use .value.

    reader

    def reader()
    Python

    Get a readable binary file-like object that can be used wherever a file-like object is expected.

    pd.read_csv(MyData.reader())
    Python

    value

    @property
    def value()
    Python

    Get the value if it was stored with the Exported.save_value or Exported.open_no_upload methods.
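
    For example, assuming a hypothetical export MyValue that was previously stored with save_value:

    from OneNB.i.repo.tag import MyValue

    # read the small stored value back directly
    threshold = MyValue.value
    Python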