Ten is a statically typed tensor programming language for defining AI models.
Ten has the following features:
(1) Succinct syntax and operators tailored to AI model definition.
(2) Fully statically typed tensors, including generic functions over tensor dimensions and batch dimensions (...).
(3) First-class hyper-parameters, model parameters, and model arguments for explicit model specification.
(4) EinOps-style reshaping and reductions: tensor dimensions are explicit, not implicit.
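As a reading aid for (3): a Ten declaration keeps the three kinds of inputs syntactically separate, with hyper-parameters in square brackets, model parameters between pipes, and model arguments in parentheses. A rough NumPy analogue of the Linear function from the example below (an illustrative sketch only; the hypothetical `linear` is not part of Ten, and the static shape checking has no counterpart here):

import numpy as np

# Analogue of Ten's Linear[N,K]|w:{N,K},b:{K}|(x:{...,N}) -> {...,K}.
# The hyper-parameters N and K become implicit in the array shapes, the
# parameters w and b become explicit arguments, and nothing is checked
# statically.
def linear(w: np.ndarray, b: np.ndarray, x: np.ndarray) -> np.ndarray:
    # w: (N, K), b: (K,), x: (..., N) -> (..., K); the leading axes of x
    # are the batch dimensions that Ten writes as {...}.
    return x @ w + b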
Example (a functional GPT2 implementation):
Gelu(x: {...}) -> {...}:
    return 0.5 * x * (1 + Tanh(0.7978845608 * (x + 0.044715 * x**3)))
SoftMax[N](x: {...,N}) -> {...,N}:
    exp_x = Exp(x - Max(x))
    return exp_x / Sum(exp_x)
LayerNorm[S,E]|g:{E},b:{E}|(x:{S,E}) -> {S,E}:
    mean = Mean(x)
    variance = Var(x)
    return g * (x - mean) / Sqrt(variance + 1e-5) + b
Linear[N,K]|w:{N,K},b:{K}|(x:{...,N}) -> {...,K}:
    return x@w + b
FFN[S,E]|c_fc, c_proj|(x:{S,E}) -> {S,E}:
    a = Gelu(Linear[E,E*4]|c_fc|(x))
    return Linear[E*4,E]|c_proj|(a)
Attention[Q,K,N,V](q:{...,Q,K}, k:{...,N,K}, v:{...,N,V}, mask:{Q,N}) -> {...,Q,V}:
    return SoftMax[N](q @ Transpose[N,K](k) / Sqrt(K) + mask) @ v
MHA[H,S,E,K]|c_attn, c_proj|(x:{S,E}) -> {S,E}:
    q, k, v = Linear[E,E*3]|c_attn|(x) {S,(3,H,K) -> 3,H,S,K}
    causal_mask = (Tri[S]() - 1) * 1e10
    out = Attention[S,K,S,K](q, k, v, causal_mask) {H,S,K -> S,(H,K)}
    return Linear[E,E]|c_proj|(out)
Transformer[H,S,E]|mlp, attn, ln_1, ln_2|(x:{S,E}) -> {S,E}:
    y = x + MHA[H,S,E,E/H]|attn|(LayerNorm[S,E]|ln_1|(x))
    return y + FFN[S,E]|mlp|(LayerNorm[S,E]|ln_2|(y))
GPT2[H,S,E,B,V]|wte, wpe, blocks, ln_f|(inputs:{S}) -> {S,V}:
    x = wte.[inputs] + wpe.[Range[S]()]
    z = for i in 0...B: x, y -> Transformer[H,S,E]|blocks.[i]|(y)
    return LayerNorm[S,E]|ln_f|(z) @ Transpose[V,E](wte)
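The braces after calls in MHA are the EinOps-style reshapes from feature (4): {S,(3,H,K) -> 3,H,S,K} splits the fused qkv projection into per-head q, k, v, and {H,S,K -> S,(H,K)} merges the heads back into the model dimension. A hedged NumPy sketch of that data movement plus the attention core (my reading of the notation, assuming row-major layout; not the language's implementation):

import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over one axis, matching SoftMax[N] above.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def mha_core(qkv: np.ndarray, H: int) -> np.ndarray:
    # qkv: (S, 3*H*K), the output of the fused c_attn projection.
    S, threeHK = qkv.shape
    K = threeHK // (3 * H)
    # Ten's {S,(3,H,K) -> 3,H,S,K}: split the last axis into (3, H, K),
    # then bring the 3 and H axes to the front.
    q, k, v = qkv.reshape(S, 3, H, K).transpose(1, 2, 0, 3)   # each (H, S, K)
    mask = (np.tri(S) - 1) * 1e10                             # 0 on/below diagonal, -1e10 above
    att = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(K) + mask) @ v  # (H, S, K)
    # Ten's {H,S,K -> S,(H,K)}: move heads next to K and fuse them back.
    return att.transpose(1, 0, 2).reshape(S, H * K)

For example, with S=4, H=2, K=3, mha_core(np.random.randn(4, 18), H=2) returns an array of shape (4, 6).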
Status: Working prototype, but there's lots more I'd love to do to bring this to life (the README has more details on future thoughts/plans).
https://github.com/lukehoban/ten