Class TextSplitter (abstract)

Abstract base class for document transformation systems.

A document transformation system takes an array of Documents and returns an array of transformed Documents. The input and output arrays need not have the same length.

For example, a text splitter breaks one large document into many smaller documents.
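As a rough illustration of a transformation whose output array is longer than its input, the sketch below cuts each document into fixed-width character windows that overlap. It is not this class's splitting algorithm. The `Doc` interface and the `splitByLength` helper are hypothetical stand-ins introduced only for this example; the parameter names mirror the `chunkSize` and `chunkOverlap` properties listed under Properties.

```typescript
// Minimal stand-in for a document (assumption: not the library's Document class).
interface Doc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Hypothetical helper: fixed-width character windows, each sharing
// `chunkOverlap` characters with the previous window.
function splitByLength(docs: Doc[], chunkSize: number, chunkOverlap: number): Doc[] {
  if (chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be smaller than chunkSize");
  }
  const step = chunkSize - chunkOverlap;
  const out: Doc[] = [];
  for (const doc of docs) {
    for (let start = 0; start < doc.pageContent.length; start += step) {
      out.push({
        pageContent: doc.pageContent.slice(start, start + chunkSize),
        metadata: { ...doc.metadata }, // each chunk keeps a copy of the metadata
      });
      // Stop once this window has reached the end of the text.
      if (start + chunkSize >= doc.pageContent.length) break;
    }
  }
  return out;
}

// One input document becomes three output documents.
const chunks = splitByLength(
  [{ pageContent: "abcdefghij", metadata: { source: "demo" } }],
  4,
  1,
);
console.log(chunks.map((chunk) => chunk.pageContent)); // [ 'abcd', 'defg', 'ghij' ]
```

With the defaults listed under Properties (`chunkSize` 1000, `chunkOverlap` 200), the same arithmetic would advance each window by 800 characters.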

Properties

chunkOverlap: number = 200
chunkSize: number = 1000
keepSeparator: boolean = false
lc_kwargs: SerializedFields
lc_namespace: string[] = ...

A path to the module that contains the class, e.g. ["langchain", "llms"]. It should usually be the same as the entrypoint the class is exported from.

lc_serializable: boolean = false
lengthFunction: ((text) => number) | ((text) => Promise<number>)

Type declaration

    • (text): number
    • Parameters

      • text: string

      Returns number

Type declaration

    • (text): Promise<number>
    • Parameters

      • text: string

      Returns Promise<number>

lc_runnable: boolean = true
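Because `lengthFunction` may return either a number or a Promise of a number, a caller can treat both cases uniformly by awaiting the result. The sketch below shows that pattern under stated assumptions: `LengthFunction`, `measure`, `countChars`, and `countWords` are hypothetical names for this example and are not part of the documented API.

```typescript
// Same shape as the lengthFunction property: sync or async.
type LengthFunction =
  | ((text: string) => number)
  | ((text: string) => Promise<number>);

// `await` unwraps a Promise and passes a plain number through unchanged,
// so one code path handles both variants.
async function measure(text: string, lengthFunction: LengthFunction): Promise<number> {
  return await lengthFunction(text);
}

// Synchronous variant: length in characters.
const countChars = (text: string): number => text.length;

// Asynchronous variant: length in whitespace-separated words.
const countWords = async (text: string): Promise<number> =>
  text.split(/\s+/).filter((word) => word.length > 0).length;

const lengths: Promise<number[]> = Promise.all([
  measure("split me up", countChars),
  measure("split me up", countWords),
]);
lengths.then((values) => console.log(values)); // [ 11, 3 ]
```

An asynchronous length function is useful when the measurement itself is asynchronous, for instance when counting tokens through a tokenizer that loads lazily.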

Accessors

  • get lc_aliases(): undefined | {
        [key: string]: string;
    }
  • A map of aliases for constructor args. Keys are the attribute names, e.g. "foo". Values are the aliases that will replace the keys in serialization. This is used, for example, to make argument names match Python.

    Returns undefined | {
        [key: string]: string;
    }

  • get lc_attributes(): undefined | SerializedFields
  • A map of additional attributes to merge with constructor args. Keys are the attribute names, e.g. "foo". Values are the attribute values, which will be serialized. These attributes need to be accepted by the constructor as arguments.

    Returns undefined | SerializedFields

  • get lc_secrets(): undefined | {
        [key: string]: string;
    }
  • A map of secrets, which will be omitted from serialization. Keys are paths to the secret in constructor args, e.g. "foo.bar.baz". Values are the secret ids, which will be used when deserializing.

    Returns undefined | {
        [key: string]: string;
    }
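The three accessors above describe how constructor arguments are reshaped during serialization. The sketch below is one way to picture their combined effect; it is not the library's serializer. `sketchSerialize` is a hypothetical helper, it handles only top-level keys (not dotted paths such as "foo.bar.baz"), and replacing a secret's value with a `{ secretId }` placeholder is an assumption made for illustration.

```typescript
type Fields = Record<string, unknown>;

// Hypothetical helper: applies attributes, secrets, and aliases to kwargs.
function sketchSerialize(
  kwargs: Fields,
  aliases: Record<string, string> = {},
  secrets: Record<string, string> = {},
  attributes: Fields = {},
): Fields {
  // lc_attributes: extra attributes are merged with the constructor args.
  const merged: Fields = { ...kwargs, ...attributes };
  const out: Fields = {};
  for (const [key, value] of Object.entries(merged)) {
    // lc_aliases: the alias, when present, replaces the key.
    const outKey = aliases[key] ?? key;
    const secretId = secrets[key];
    // lc_secrets: the secret value is left out; only its id is recorded
    // (assumed placeholder shape, see the note above).
    out[outKey] = secretId !== undefined ? { secretId } : value;
  }
  return out;
}

const serialized = sketchSerialize(
  { chunkSize: 1000, apiKey: "not-a-real-key" }, // constructor args
  { chunkSize: "chunk_size" },                   // aliases
  { apiKey: "EXAMPLE_API_KEY" },                 // secrets: key -> secret id
  { keepSeparator: false },                      // additional attributes
);
console.log(JSON.stringify(serialized));
// {"chunk_size":1000,"apiKey":{"secretId":"EXAMPLE_API_KEY"},"keepSeparator":false}
```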

Methods

  • Internal method that handles batching and configuration for a runnable. It takes a function, input values, and optional configuration, and returns a promise that resolves to the output values.

    Type Parameters

    • T extends Document<Record<string, any>>[]

    Parameters

    Returns Promise<(Error | Document<Record<string, any>>[])[]>

    A promise that resolves to the output values.

  • Method to invoke the document transformation. This method calls the transformDocuments method with the provided input.

    Parameters

    • input: Document<Record<string, any>>[]

      The input documents to be transformed.

    • Optional _options: Partial<BaseCallbackConfig>

      Optional configuration object to customize the behavior of callbacks.

    Returns Promise<Document<Record<string, any>>[]>

    A Promise that resolves to the transformed documents.
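The delegation described above, where invoking the transformer simply calls transformDocuments with the same input, can be sketched as follows. `Doc`, `SketchTransformer`, and `ParagraphSplitter` are hypothetical stand-ins written for this example, and the options argument is accepted but ignored here.

```typescript
// Minimal stand-in for a document (assumption: not the library's Document class).
interface Doc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Hypothetical base class mirroring the relationship described above.
abstract class SketchTransformer {
  abstract transformDocuments(documents: Doc[]): Promise<Doc[]>;

  // invoke forwards its input to transformDocuments unchanged.
  invoke(input: Doc[], _options?: Record<string, unknown>): Promise<Doc[]> {
    return this.transformDocuments(input);
  }
}

// Hypothetical subclass: one output document per non-empty paragraph.
class ParagraphSplitter extends SketchTransformer {
  async transformDocuments(documents: Doc[]): Promise<Doc[]> {
    const out: Doc[] = [];
    for (const doc of documents) {
      for (const paragraph of doc.pageContent.split("\n\n")) {
        if (paragraph.trim().length > 0) {
          out.push({ pageContent: paragraph, metadata: { ...doc.metadata } });
        }
      }
    }
    return out;
  }
}

const result: Promise<Doc[]> = new ParagraphSplitter().invoke([
  { pageContent: "first paragraph\n\nsecond paragraph", metadata: { source: "demo" } },
]);
result.then((docs) => console.log(docs.length)); // 2
```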

  • Stream all output from a runnable, as reported to the callback system. This includes all inner runs of LLMs, Retrievers, Tools, etc. Output is streamed as Log objects, which include a list of jsonpatch ops that describe how the state of the run has changed in each step, and the final state of the run. The jsonpatch ops can be applied in order to construct state.

    Parameters

    • input: Document<Record<string, any>>[]
    • Optional options: Partial<BaseCallbackConfig>
    • Optional streamOptions: Omit<LogStreamCallbackHandlerInput, "autoClose">

    Returns AsyncGenerator<RunLogPatch, any, unknown>
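To make "applied in order to construct state" concrete, the sketch below folds a list of JSON Patch style operations into a state object. It is a simplified illustration, not the library's implementation: `PatchOp` and `applyOps` are hypothetical, only the `add` and `replace` operations on object members and the `-` array-append path segment are handled, and the state field names (`steps`, `final`) are invented for the example.

```typescript
// Simplified patch operation (assumption: a small subset of JSON Patch).
interface PatchOp {
  op: "add" | "replace";
  path: string; // JSON Pointer, e.g. "/steps/-"
  value: unknown;
}

// Hypothetical helper: applies ops in order and returns the new state.
function applyOps(state: unknown, ops: PatchOp[]): unknown {
  // Work on a deep copy so the caller's state is not mutated.
  let root: any = JSON.parse(JSON.stringify(state));
  for (const { path, value } of ops) {
    if (path === "") {
      root = value; // an empty pointer targets the whole document
      continue;
    }
    // Split the pointer into segments and undo the ~1 and ~0 escapes.
    const keys = path
      .split("/")
      .slice(1)
      .map((key) => key.replace(/~1/g, "/").replace(/~0/g, "~"));
    let target: any = root;
    for (const key of keys.slice(0, -1)) target = target[key];
    const last = keys[keys.length - 1] as string;
    if (Array.isArray(target) && last === "-") {
      target.push(value); // "-" appends to the end of an array
    } else {
      target[last] = value; // set or overwrite an object member
    }
  }
  return root;
}

const finalState = applyOps({ steps: [], final: null }, [
  { op: "add", path: "/steps/-", value: "split" },
  { op: "add", path: "/steps/-", value: "embed" },
  { op: "replace", path: "/final", value: "done" },
]);
console.log(JSON.stringify(finalState)); // {"steps":["split","embed"],"final":"done"}
```

Because each operation builds on the state left by the previous one, the order of application matters; applying the same operations in a different order can produce a different state.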

  • Default implementation of transform, which buffers input and then calls stream. Subclasses should override this method if they can start producing output while input is still being generated.

    Parameters

    Returns AsyncGenerator<Document<Record<string, any>>[], any, unknown>
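The "buffer, then stream" behaviour described above can be pictured with async generators. This is a sketch under assumptions, not the library's code: `bufferThenStream`, `source`, and `streamAll` are hypothetical, and concatenating the buffered chunks into a single array is an assumed way of combining the input.

```typescript
// Hypothetical default transform: consume the whole input first, then
// hand the buffered value to a stream function and forward its output.
async function* bufferThenStream<I, O>(
  input: AsyncIterable<I[]>,
  stream: (all: I[]) => AsyncIterable<O>,
): AsyncGenerator<O> {
  const buffered: I[] = [];
  for await (const chunk of input) {
    buffered.push(...chunk); // nothing is emitted while input is still arriving
  }
  yield* stream(buffered);
}

// Example input arriving in two chunks.
async function* source(): AsyncGenerator<string[]> {
  yield ["a", "b"];
  yield ["c"];
}

// Example stream function: emits a single joined string.
async function* streamAll(all: string[]): AsyncGenerator<string> {
  yield all.join("");
}

async function collect(): Promise<string[]> {
  const seen: string[] = [];
  for await (const out of bufferThenStream(source(), streamAll)) {
    seen.push(out);
  }
  return seen;
}

const outputs: Promise<string[]> = collect();
outputs.then((seen) => console.log(seen)); // [ 'abc' ]
```

A subclass that can produce output incrementally would instead yield inside the input loop, which is the override the description recommends.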

  • Helper method to transform an Iterator of Input values into an Iterator of Output values, with callbacks. Use this to implement stream() or transform() in Runnable subclasses.

    Type Parameters

    • I extends Document<Record<string, any>>[]

    • O extends Document<Record<string, any>>[]

    Parameters

    • inputGenerator: AsyncGenerator<I, any, unknown>
    • transformer: ((generator, runManager?, options?) => AsyncGenerator<O, any, unknown>)
        • (generator, runManager?, options?): AsyncGenerator<O, any, unknown>
        • Parameters

          Returns AsyncGenerator<O, any, unknown>

    • Optional options: BaseCallbackConfig & {
          runType?: string;
      }

    Returns AsyncGenerator<O, any, unknown>
