Ten is a statically typed tensor programming language for defining AI models.
Ten has the following features:
(1) Succinct syntax and operators tailored to AI model definition.
(2) Fully statically typed tensors, including generic functions over tensor dimensions and batch dimensions (...).
(3) First-class hyper-parameters, model parameters, and model arguments for explicit model specification.
(4) EinOps-style reshaping and reductions: tensor dimensions are explicit, not implicit.
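As a reading aid for (3): a Ten declaration keeps the three kinds of inputs syntactically separate, with hyper-parameters in square brackets, model parameters between pipes, and model arguments in parentheses. A rough NumPy analogue of the Linear function from the example below (an illustrative sketch only; the hypothetical `linear` is not part of Ten, and the static shape checking has no counterpart here):

import numpy as np

# Analogue of Ten's Linear[N,K]|w:{N,K},b:{K}|(x:{...,N}) -> {...,K}.
# The hyper-parameters N and K become implicit in the array shapes, the
# parameters w and b become explicit arguments, and nothing is checked
# statically.
def linear(w: np.ndarray, b: np.ndarray, x: np.ndarray) -> np.ndarray:
    # w: (N, K), b: (K,), x: (..., N) -> (..., K); the leading axes of x
    # are the batch dimensions that Ten writes as {...}.
    return x @ w + b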
Example (a functional GPT2 implementation):
Gelu(x: {...}) -> {...}:
    return 0.5 * x * (1 + Tanh(0.7978845608 * (x + 0.044715 * x**3)))
SoftMax[N](x: {...,N}) -> {...,N}:
    exp_x = Exp(x - Max(x))
    return exp_x / Sum(exp_x)
LayerNorm[S,E]|g:{E},b:{E}|(x:{S,E}) -> {S,E}:
    mean = Mean(x)
    variance = Var(x)
    return g * (x - mean) / Sqrt(variance + 1e-5) + b
Linear[N,K]|w:{N,K},b:{K}|(x:{...,N}) -> {...,K}:
    return x@w + b
FFN[S,E]|c_fc, c_proj|(x:{S,E}) -> {S,E}:
    a = Gelu(Linear[E,E*4]|c_fc|(x))
    return Linear[E*4,E]|c_proj|(a)
Attention[Q,K,N,V](q:{...,Q,K}, k:{...,N,K}, v:{...,N,V}, mask:{Q,N}) -> {...,Q,V}:
    return SoftMax[N](q @ Transpose[N,K](k) / Sqrt(K) + mask) @ v
MHA[H,S,E,K]|c_attn, c_proj|(x:{S,E}) -> {S,E}:
    q, k, v = Linear[E,E*3]|c_attn|(x) {S,(3,H,K) -> 3,H,S,K}
    causal_mask = (Tri[S]() - 1) * 1e10
    out = Attention[S,K,S,K](q, k, v, causal_mask) {H,S,K -> S,(H,K)}
    return Linear[E,E]|c_proj|(out)
Transformer[H,S,E]|mlp, attn, ln_1, ln_2|(x:{S,E}) -> {S,E}:
    y = x + MHA[H,S,E,E/H]|attn|(LayerNorm[S,E]|ln_1|(x))
    return y + FFN[S,E]|mlp|(LayerNorm[S,E]|ln_2|(y))
GPT2[H,S,E,B,V]|wte, wpe, blocks, ln_f|(inputs:{S}) -> {S,V}:
    x = wte.[inputs] + wpe.[Range[S]()]
    z = for i in 0...B: x, y -> Transformer[H,S,E]|blocks.[i]|(y)
    return LayerNorm[S,E]|ln_f|(z) @ Transpose[V,E](wte)
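The braces after calls in MHA are the EinOps-style reshapes from feature (4): {S,(3,H,K) -> 3,H,S,K} splits the fused qkv projection into per-head q, k, v, and {H,S,K -> S,(H,K)} merges the heads back into the model dimension. A hedged NumPy sketch of that data movement plus the attention core (my reading of the notation, assuming row-major layout; not the language's implementation):

import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over one axis, matching SoftMax[N] above.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def mha_core(qkv: np.ndarray, H: int) -> np.ndarray:
    # qkv: (S, 3*H*K), the output of the fused c_attn projection.
    S, threeHK = qkv.shape
    K = threeHK // (3 * H)
    # Ten's {S,(3,H,K) -> 3,H,S,K}: split the last axis into (3, H, K),
    # then bring the 3 and H axes to the front.
    q, k, v = qkv.reshape(S, 3, H, K).transpose(1, 2, 0, 3)   # each (H, S, K)
    mask = (np.tri(S) - 1) * 1e10                             # 0 on/below diagonal, -1e10 above
    att = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(K) + mask) @ v  # (H, S, K)
    # Ten's {H,S,K -> S,(H,K)}: move heads next to K and fuse them back.
    return att.transpose(1, 0, 2).reshape(S, H * K)

For example, with S=4, H=2, K=3, mha_core(np.random.randn(4, 18), H=2) returns an array of shape (4, 6).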
Status: Working prototype, but there's lots more I'd love to do to bring this to life (the README has more details on future thoughts/plans).
https://github.com/lukehoban/ten