AI Chat Got

AI Chat Got — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Textual case-based reasoning

    Textual case-based reasoning

    Textual case-based reasoning (TCBR) is a subtopic of case-based reasoning, in short CBR, a popular area in artificial intelligence. CBR suggests the ways to use past experiences to solve future similar problems, requiring that past experiences be structured in a form similar to attribute-value pairs. This leads to the investigation of textual descriptions for knowledge exploration whose output will be, in turn, used to solve similar problems. == Subareas == Textual case-base reasoning research has focused on: measuring similarity between textual cases mapping texts into structured case representations adapting textual cases for reuse automatically generating representations.

    Read more →
  • AI Paragraph Rewriters Reviews: What Actually Works in 2026

    AI Paragraph Rewriters Reviews: What Actually Works in 2026

    Looking for the best AI paragraph rewriter? An AI paragraph rewriter is software that uses machine learning to help you get more done — it can save you hours every week by automating repetitive work. Most options offer a generous free tier, with paid plans unlocking higher limits, faster processing, and team features. Whether you are a beginner or a pro, the right AI paragraph rewriter slots into your workflow and pays for itself fast. This guide breaks down the top picks, their pros and cons, and who each one is best for.

    Read more →
  • The Best Free AI Background Remover for Beginners

    The Best Free AI Background Remover for Beginners

    In search of the best AI background remover? An AI background remover is software that uses machine learning to help you get more done — it turns a rough idea into a polished result in seconds. When choosing one, weigh output quality, pricing, export formats, and how well it fits the tools you already use. Whether you are a beginner or a pro, the right AI background remover slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.

    Read more →
  • Top 10 AI Code Generators Compared (2026)

    Top 10 AI Code Generators Compared (2026)

    Curious about the best AI code generator? An AI code generator is software that uses machine learning to help you get more done — it combines speed, accuracy, and an interface that just works. Hands-on testing shows real-world results vary, so a short free trial is the smartest way to decide. Whether you are a beginner or a pro, the right AI code generator slots into your workflow and pays for itself fast. This guide breaks down the top picks, their pros and cons, and who each one is best for.

    Read more →
  • MY F.C.

    MY F.C.

    MY F.C. is a freemium app designed to organise and administer football teams. It is developed by MY F.C. Limited, a private company headquartered in Auckland, New Zealand. The app allows users to build a team by adding players and from there they can create trainings and matches, keep up with relevant news in the curated newsfeed, record statistics both individually and team based, follow the games live in the match-centre. The app also features integrated lineup builder with custom team kits. == History == Founders Sam Jenkins, Mike Simpson and Sam Jasper started MY F.C. in 2015 to help them "run their football lives". The app was launched on Android and iOS on 14 February 2017. == Accolades == MY F.C. won the first place prize at Bank of New Zealand Start-up Alley 2017 competition that aims to discover New Zealand start-ups who are doing innovative work and ready to establish themselves as long-term, sustainable businesses. The prize package included $15,000 and a trip to San Francisco.

    Read more →
  • Arthur Zimek

    Arthur Zimek

    Arthur Zimek is a professor in data mining, data science and machine learning at the University of Southern Denmark in Odense, Denmark. He graduated from LMU Munich in Germany, where he worked with Prof. Hans-Peter Kriegel. His dissertation on "Correlation Clustering" was awarded the "SIGKDD Doctoral Dissertation Award 2009 Runner-up" by the Association for Computing Machinery. He is well known for his work on outlier detection, density-based clustering, correlation clustering, and the curse of dimensionality. He is one of the founders and core developers of the open-source ELKI data mining framework.

    Read more →
  • Top 10 AI Website Builders Compared (2026)

    Top 10 AI Website Builders Compared (2026)

    Curious about the best AI website builder? An AI website builder is software that uses machine learning to help you get more done — it combines speed, accuracy, and an interface that just works. Hands-on testing shows real-world results vary, so a short free trial is the smartest way to decide. Whether you are a beginner or a pro, the right AI website builder slots into your workflow and pays for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.

    Read more →
  • Trigram tagger

    Trigram tagger

    In computational linguistics, a trigram tagger is a statistical method for automatically identifying words as being nouns, verbs, adjectives, adverbs, etc. based on second order Markov models that consider triples of consecutive words. It is trained on a text corpus as a method to predict the next word, taking the product of the probabilities of unigram, bigram and trigram. In speech recognition, algorithms utilizing trigram-tagger score better than those algorithms utilizing IIMM tagger but less well than Net tagger. The description of the trigram tagger is provided by Brants (2000).

    Read more →
  • Speech segmentation

    Speech segmentation

    Speech segmentation is the process of identifying the boundaries between words, syllables, or phonemes in spoken natural languages. The term applies both to the mental processes used by humans, and to artificial processes of natural language processing. In the field of automatic pronunciation assessment, the process of segmenting an utterance against expected word(s) is called forced alignment. Speech segmentation is a subfield of general speech perception and an important subproblem of the technologically focused field of speech recognition, and cannot be adequately solved in isolation. As in most natural language processing problems, one must take into account context, grammar, and semantics, and even so the result is often a probabilistic division (statistically based on likelihood) rather than a categorical one. Though it seems that coarticulation—a phenomenon which may happen between adjacent words just as easily as within a single word—presents the main challenge in speech segmentation across languages, some other problems and strategies employed in solving those problems can be seen in the following sections. This problem overlaps to some extent with the problem of text segmentation that occurs in some languages which are traditionally written without inter-word spaces, like Chinese and Japanese, compared to writing systems which indicate speech segmentation between words by a word divider, such as the space. However, even for those languages, text segmentation is often much easier than speech segmentation, because the written language usually has little interference between adjacent words, and often contains additional clues not present in speech (such as the use of Chinese characters for word stems in Japanese). == Lexical recognition == In natural languages, the meaning of a complex spoken sentence can be understood by decomposing it into smaller lexical segments (roughly, the words of the language), associating a meaning to each segment, and combining those meanings according to the grammar rules of the language. Though lexical recognition is not thought to be used by infants in their first year, due to their highly limited vocabularies, it is one of the major processes involved in speech segmentation for adults. Three main models of lexical recognition exist in current research: first, whole-word access, which argues that words have a whole-word representation in the lexicon; second, decomposition, which argues that morphologically complex words are broken down into their morphemes (roots, stems, inflections, etc.) and then interpreted and; third, the view that whole-word and decomposition models are both used, but that the whole-word model provides some computational advantages and is therefore dominant in lexical recognition. To give an example, in a whole-word model, the word "cats" might be stored and searched for by letter, first "c", then "ca", "cat", and finally "cats". The same word, in a decompositional model, would likely be stored under the root word "cat" and could be searched for after removing the "s" suffix. "Falling", similarly, would be stored as "fall" and suffixed with the "ing" inflection. Though proponents of the decompositional model recognize that a morpheme-by-morpheme analysis may require significantly more computation, they argue that the unpacking of morphological information is necessary for other processes (such as syntactic structure) which may occur parallel to lexical searches. As a whole, research into systems of human lexical recognition is limited due to little experimental evidence that fully discriminates between the three main models. In any case, lexical recognition likely contributes significantly to speech segmentation through the contextual clues it provides, given that it is a heavily probabilistic system—based on the statistical likelihood of certain words or constituents occurring together. For example, one can imagine a situation where a person might say "I bought my dog at a ____ shop" and the missing word's vowel is pronounced as in "net", "sweat", or "pet". While the probability of "netshop" is extremely low, since "netshop" isn't currently a compound or phrase in English, and "sweatshop" also seems contextually improbable, "pet shop" is a good fit because it is a common phrase and is also related to the word "dog". Moreover, an utterance can have different meanings depending on how it is split into words. A popular example, often quoted in the field, is the phrase "How to wreck a nice beach", which sounds very similar to "How to recognize speech". As this example shows, proper lexical segmentation depends on context and semantics which draws on the whole of human knowledge and experience, and would thus require advanced pattern recognition and artificial intelligence technologies to be implemented on a computer. Lexical recognition is of particular value in the field of computer speech recognition, since the ability to build and search a network of semantically connected ideas would greatly increase the effectiveness of speech-recognition software. Statistical models can be used to segment and align recorded speech to words or phones. Applications include automatic lip-synch timing for cartoon animation, follow-the-bouncing-ball video sub-titling, and linguistic research. Automatic segmentation and alignment software is commercially available. == Phonotactic cues == For most spoken languages, the boundaries between lexical units are difficult to identify; phonotactics are one answer to this issue. One might expect that the inter-word spaces used by many written languages like English or Spanish would correspond to pauses in their spoken version, but that is true only in very slow speech, when the speaker deliberately inserts those pauses. In normal speech, one typically finds many consecutive words being said with no pauses between them, and often the final sounds of one word blend smoothly or fuse with the initial sounds of the next word. The notion that speech is produced like writing, as a sequence of distinct vowels and consonants, may be a relic of alphabetic heritage for some language communities. In fact, the way vowels are produced depends on the surrounding consonants just as consonants are affected by surrounding vowels; this is called coarticulation. For example, in the word "kit", the [k] is farther forward than when we say 'caught'. But also, the vowel in "kick" is phonetically different from the vowel in "kit", though we normally do not hear this. In addition, there are language-specific changes which occur in casual speech which makes it quite different from spelling. For example, in English, the phrase "hit you" could often be more appropriately spelled "hitcha". From a decompositional perspective, in many cases, phonotactics play a part in letting speakers know where to draw word boundaries. In English, the word "strawberry" is perceived by speakers as consisting (phonetically) of two parts: "straw" and "berry". Other interpretations such as "stra" and "wberry" are inhibited by English phonotactics, which does not allow the cluster "wb" word-initially. Other such examples are "day/dream" and "mile/stone" which are unlikely to be interpreted as "da/ydream" or "mil/estone" due to the phonotactic probability or improbability of certain clusters. The sentence "Five women left", which could be phonetically transcribed as [faɪvwɪmɘnlɛft], is marked since neither /vw/ in /faɪvwɪmɘn/ nor /nl/ in /wɪmɘnlɛft/ are allowed as syllable onsets or codas in English phonotactics. These phonotactic cues often allow speakers to easily distinguish the boundaries in words. Vowel harmony in languages like Finnish can also serve to provide phonotactic cues. While the system does not allow front vowels and back vowels to exist together within one morpheme, compounds allow two morphemes to maintain their own vowel harmony while coexisting in a word. Therefore, in compounds such as "selkä/ongelma" ('back problem') where vowel harmony is distinct between two constituents in a compound, the boundary will be wherever the switch in harmony takes place—between the "ä" and the "ö" in this case. Still, there are instances where phonotactics may not aid in segmentation. Words with unclear clusters or uncontrasted vowel harmony as in "opinto/uudistus" ('student reform') do not offer phonotactic clues as to how they are segmented. From the perspective of the whole-word model, however, these words are thought be stored as full words, so the constituent parts would not necessarily be relevant to lexical recognition. == In infants and non-natives == Infants are one major focus of research in speech segmentation. Since infants have not yet acquired a lexicon capable of providing extensive contextual clues or probability-based word searches within their first year, as mentioned above, they must often rely primarily upon phonotactic and rhythmic cues (with prosody being the dominant cue), all

    Read more →
  • AI Code-review Tools Reviews: What Actually Works in 2026

    AI Code-review Tools Reviews: What Actually Works in 2026

    Shopping for the best AI code-review tool? An AI code-review tool is software that uses machine learning to help you get more done — it keeps getting smarter as the underlying models improve. Pricing, accuracy, and the size of the model behind the tool are the three factors that most affect daily usefulness. Whether you are a beginner or a pro, the right AI code-review tool slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.

    Read more →
  • Structured support vector machine

    Structured support vector machine

    The structured supportvector machine is a machine learning algorithm that generalizes the support vector machine (SVM) classifier. Whereas the SVM classifier supports binary classification, multiclass classification and regression, the structured SVM allows training of a classifier for general structured output labels. As an example, a sample instance might be a natural language sentence, and the output label is an annotated parse tree. Training a classifier consists of showing pairs of correct sample and output label pairs. After training, the structured SVM model allows one to predict for new sample instances the corresponding output label; that is, given a natural language sentence, the classifier can produce the most likely parse tree. == Training == For a set of n {\displaystyle n} training instances ( x i , y i ) ∈ X × Y {\displaystyle ({\boldsymbol {x}}_{i},y_{i})\in {\mathcal {X}}\times {\mathcal {Y}}} , i = 1 , … , n {\displaystyle i=1,\dots ,n} from a sample space X {\displaystyle {\mathcal {X}}} and label space Y {\displaystyle {\mathcal {Y}}} , the structured SVM minimizes the following regularized risk function. min w ‖ w ‖ 2 + C ∑ i = 1 n max y ∈ Y ( 0 , Δ ( y i , y ) + ⟨ w , Ψ ( x i , y ) ⟩ − ⟨ w , Ψ ( x i , y i ) ⟩ ) {\displaystyle {\underset {\boldsymbol {w}}{\min }}\quad \|{\boldsymbol {w}}\|^{2}+C\sum _{i=1}^{n}{\underset {y\in {\mathcal {Y}}}{\max }}\left(0,\Delta (y_{i},y)+\langle {\boldsymbol {w}},\Psi ({\boldsymbol {x}}_{i},y)\rangle -\langle {\boldsymbol {w}},\Psi ({\boldsymbol {x}}_{i},y_{i})\rangle \right)} The function is convex in w {\displaystyle {\boldsymbol {w}}} because the maximum of a set of affine functions is convex. The function Δ : Y × Y → R + {\displaystyle \Delta :{\mathcal {Y}}\times {\mathcal {Y}}\to \mathbb {R} _{+}} measures a distance in label space and is an arbitrary function (not necessarily a metric) satisfying Δ ( y , z ) ≥ 0 {\displaystyle \Delta (y,z)\geq 0} and Δ ( y , y ) = 0 ∀ y , z ∈ Y {\displaystyle \Delta (y,y)=0\;\;\forall y,z\in {\mathcal {Y}}} . The function Ψ : X × Y → R d {\displaystyle \Psi :{\mathcal {X}}\times {\mathcal {Y}}\to \mathbb {R} ^{d}} is a feature function, extracting some feature vector from a given sample and label. The design of this function depends very much on the application. Because the regularized risk function above is non-differentiable, it is often reformulated in terms of a quadratic program by introducing one slack variable ξ i {\displaystyle \xi _{i}} for each sample, each representing the value of the maximum. The standard structured SVM primal formulation is given as follows. min w , ξ ‖ w ‖ 2 + C ∑ i = 1 n ξ i s.t. ⟨ w , Ψ ( x i , y i ) ⟩ − ⟨ w , Ψ ( x i , y ) ⟩ + ξ i ≥ Δ ( y i , y ) , i = 1 , … , n , ∀ y ∈ Y {\displaystyle {\begin{array}{cl}{\underset {{\boldsymbol {w}},{\boldsymbol {\xi }}}{\min }}&\|{\boldsymbol {w}}\|^{2}+C\sum _{i=1}^{n}\xi _{i}\\{\textrm {s.t.}}&\langle {\boldsymbol {w}},\Psi ({\boldsymbol {x}}_{i},y_{i})\rangle -\langle {\boldsymbol {w}},\Psi ({\boldsymbol {x}}_{i},y)\rangle +\xi _{i}\geq \Delta (y_{i},y),\qquad i=1,\dots ,n,\quad \forall y\in {\mathcal {Y}}\end{array}}} == Inference == At test time, only a sample x ∈ X {\displaystyle {\boldsymbol {x}}\in {\mathcal {X}}} is known, and a prediction function f : X → Y {\displaystyle f:{\mathcal {X}}\to {\mathcal {Y}}} maps it to a predicted label from the label space Y {\displaystyle {\mathcal {Y}}} . For structured SVMs, given the vector w {\displaystyle {\boldsymbol {w}}} obtained from training, the prediction function is the following. f ( x ) = argmax y ∈ Y ⟨ w , Ψ ( x , y ) ⟩ {\displaystyle f({\boldsymbol {x}})={\underset {y\in {\mathcal {Y}}}{\textrm {argmax}}}\quad \langle {\boldsymbol {w}},\Psi ({\boldsymbol {x}},y)\rangle } Therefore, the maximizer over the label space is the predicted label. Solving for this maximizer is the so-called inference problem and similar to making a maximum a-posteriori (MAP) prediction in probabilistic models. Depending on the structure of the function Ψ {\displaystyle \Psi } , solving for the maximizer can be a hard problem. == Separation == The above quadratic program involves a very large, possibly infinite number of linear inequality constraints. In general, the number of inequalities is too large to be optimized over explicitly. Instead the problem is solved by using delayed constraint generation where only a finite and small subset of the constraints is used. Optimizing over a subset of the constraints enlarges the feasible set and will yield a solution that provides a lower bound on the objective. To test whether the solution w {\displaystyle {\boldsymbol {w}}} violates constraints of the complete set inequalities, a separation problem needs to be solved. As the inequalities decompose over the samples, for each sample ( x i , y i ) {\displaystyle ({\boldsymbol {x}}_{i},y_{i})} the following problem needs to be solved. y n ∗ = argmax y ∈ Y ( Δ ( y i , y ) + ⟨ w , Ψ ( x i , y ) ⟩ − ⟨ w , Ψ ( x i , y i ) ⟩ − ξ i ) {\displaystyle y_{n}^{}={\underset {y\in {\mathcal {Y}}}{\textrm {argmax}}}\left(\Delta (y_{i},y)+\langle {\boldsymbol {w}},\Psi ({\boldsymbol {x}}_{i},y)\rangle -\langle {\boldsymbol {w}},\Psi ({\boldsymbol {x}}_{i},y_{i})\rangle -\xi _{i}\right)} The right hand side objective to be maximized is composed of the constant − ⟨ w , Ψ ( x i , y i ) ⟩ − ξ i {\displaystyle -\langle {\boldsymbol {w}},\Psi ({\boldsymbol {x}}_{i},y_{i})\rangle -\xi _{i}} and a term dependent on the variables optimized over, namely Δ ( y i , y ) + ⟨ w , Ψ ( x i , y ) ⟩ {\displaystyle \Delta (y_{i},y)+\langle {\boldsymbol {w}},\Psi ({\boldsymbol {x}}_{i},y)\rangle } . If the achieved right hand side objective is smaller or equal to zero, no violated constraints for this sample exist. If it is strictly larger than zero, the most violated constraint with respect to this sample has been identified. The problem is enlarged by this constraint and resolved. The process continues until no violated inequalities can be identified. If the constants are dropped from the above problem, we obtain the following problem to be solved. y i ∗ = argmax y ∈ Y ( Δ ( y i , y ) + ⟨ w , Ψ ( x i , y ) ⟩ ) {\displaystyle y_{i}^{}={\underset {y\in {\mathcal {Y}}}{\textrm {argmax}}}\left(\Delta (y_{i},y)+\langle {\boldsymbol {w}},\Psi ({\boldsymbol {x}}_{i},y)\rangle \right)} This problem looks very similar to the inference problem. The only difference is the addition of the term Δ ( y i , y ) {\displaystyle \Delta (y_{i},y)} . Most often, it is chosen such that it has a natural decomposition in label space. In that case, the influence of Δ {\displaystyle \Delta } can be encoded into the inference problem and solving for the most violating constraint is equivalent to solving the inference problem.

    Read more →
  • State complexity

    State complexity

    State complexity is an area of theoretical computer science dealing with the size of abstract automata, such as different kinds of finite automata. The classical result in the area is that simulating an n {\displaystyle n} -state nondeterministic finite automaton by a deterministic finite automaton requires exactly 2 n {\displaystyle 2^{n}} states in the worst case. == Transformation between variants of finite automata == Finite automata can be deterministic and nondeterministic, one-way (DFA, NFA) and two-way (2DFA, 2NFA). Other related classes are unambiguous (UFA), self-verifying (SVFA) and alternating (AFA) finite automata. These automata can also be two-way (2UFA, 2SVFA, 2AFA). All these machines can accept exactly the regular languages. However, the size of different types of automata necessary to accept the same language (measured in the number of their states) may be different. For any two types of finite automata, the state complexity tradeoff between them is an integer function f {\displaystyle f} where f ( n ) {\displaystyle f(n)} is the least number of states in automata of the second type sufficient to recognize every language recognized by an n {\displaystyle n} -state automaton of the first type. The following results are known. NFA to DFA: 2 n {\displaystyle 2^{n}} states. This is the subset construction by Rabin and Scott, proved optimal by Lupanov. UFA to DFA: 2 n {\displaystyle 2^{n}} states, see Leung, An earlier lower bound by Schmidt was smaller. NFA to UFA: 2 n − 1 {\displaystyle 2^{n}-1} states, see Leung. There was an earlier smaller lower bound by Schmidt. SVFA to DFA: Θ ( 3 n / 3 ) {\displaystyle \Theta (3^{n/3})} states, see Jirásková and Pighizzini 2DFA to DFA: n ( n n − ( n − 1 ) n ) {\displaystyle n(n^{n}-(n-1)^{n})} states, see Kapoutsis. Earlier construction by Shepherdson used more states, and an earlier lower bound by Moore was smaller. 2DFA to NFA: ( 2 n n + 1 ) = O ( 4 n n ) {\displaystyle {\binom {2n}{n+1}}=O({\frac {4^{n}}{\sqrt {n}}})} , see Kapoutsis. Earlier construction by Birget used more states. 2NFA to NFA: ( 2 n n + 1 ) {\displaystyle {\binom {2n}{n+1}}} , see Kapoutsis. 2NFA to NFA accepting the complement: O ( 4 n ) {\displaystyle O(4^{n})} states, see Vardi. AFA to DFA: 2 2 n {\displaystyle 2^{2^{n}}} states, see Chandra, Kozen and Stockmeyer. AFA to NFA: 2 n {\displaystyle 2^{n}} states, see Fellah, Jürgensen and Yu. 2AFA to DFA: 2 n 2 n {\displaystyle 2^{n2^{n}}} , see Ladner, Lipton and Stockmeyer. 2AFA to NFA: 2 Θ ( n log ⁡ n ) {\displaystyle 2^{\Theta (n\log n)}} , see Geffert and Okhotin. === The 2DFA vs. 2NFA problem and logarithmic space === It is an open problem whether all 2NFAs can be converted to 2DFAs with polynomially many states, i.e. whether there is a polynomial p ( n ) {\displaystyle p(n)} such that for every n {\displaystyle n} -state 2NFA there exists a p ( n ) {\displaystyle p(n)} -state 2DFA. The problem was raised by Sakoda and Sipser, who compared it to the P vs. NP problem in the computational complexity theory. Berman and Lingas discovered a formal relation between this problem and the L vs. NL open problem. This relation was further elaborated by Kapoutsis. == State complexity of operations for finite automata == Given a binary regularity-preserving operation on languages ∘ {\displaystyle \circ } and a family of automata X (DFA, NFA, etc.), the state complexity of ∘ {\displaystyle \circ } is an integer function f ( m , n ) {\displaystyle f(m,n)} such that for each m-state X-automaton A and n-state X-automaton B there is an f ( m , n ) {\displaystyle f(m,n)} -state X-automaton for L ( A ) ∘ L ( B ) {\displaystyle L(A)\circ L(B)} , and for all integers m, n there is an m-state X-automaton A and an n-state X-automaton B such that every X-automaton for L ( A ) ∘ L ( B ) {\displaystyle L(A)\circ L(B)} must have at least f ( m , n ) {\displaystyle f(m,n)} states. Analogous definition applies for operations with any number of arguments. The first results on state complexity of operations for DFAs were published by Maslov and by Yu, Zhuang and Salomaa. Holzer and Kutrib pioneered the state complexity of operations on NFA. The known results for basic operations are listed below. === Union === If language L 1 {\displaystyle L_{1}} requires m states and language L 2 {\displaystyle L_{2}} requires n states, how many states does L 1 ∪ L 2 {\displaystyle L_{1}\cup L_{2}} require? DFA: m n {\displaystyle mn} states, see Maslov and Yu, Zhuang and Salomaa. NFA: m + n + 1 {\displaystyle m+n+1} states, see Holzer and Kutrib. UFA: at least min ( n , m ) Ω ( log ⁡ ( min ( n , m ) ) ) {\displaystyle \min(n,m)^{\Omega (\log(\min(n,m)))}} ; between m n + m + n {\displaystyle mn+m+n} and m + n m 2 0.79 m {\displaystyle m+nm2^{0.79m}} states, see Jirásek, Jirásková and Šebej. SVFA: m n {\displaystyle mn} states, see Jirásek, Jirásková and Szabari. 2DFA: between m + n {\displaystyle m+n} and 4 m + n + 4 {\displaystyle 4m+n+4} states, see Kunc and Okhotin. 2NFA: m + n {\displaystyle m+n} states, see Kunc and Okhotin. === Intersection === How many states does L 1 ∩ L 2 {\displaystyle L_{1}\cap L_{2}} require? DFA: m n {\displaystyle mn} states, see Maslov and Yu, Zhuang and Salomaa. NFA: m n {\displaystyle mn} states, see Holzer and Kutrib. UFA: m n {\displaystyle mn} states, see Jirásek, Jirásková and Šebej. SVFA: m n {\displaystyle mn} states, see Jirásek, Jirásková and Szabari. 2DFA: between m + n {\displaystyle m+n} and m + n + 1 {\displaystyle m+n+1} states, see Kunc and Okhotin. 2NFA: between m + n {\displaystyle m+n} and m + n + 1 {\displaystyle m+n+1} states, see Kunc and Okhotin. === Complementation === If language L requires n states then how many states does its complement require? DFA: n {\displaystyle n} states, by exchanging accepting and rejecting states. NFA: 2 n {\displaystyle 2^{n}} states, see Birget. or Jirásková UFA: at least n Ω ~ ( log ⁡ n ) {\displaystyle n^{{\tilde {\Omega }}(\log n)}} states, see Göös, Kiefer and Yuan, (this follows an earlier bound by Raskin); and at most n + 1 ⋅ 2 0.5 n {\displaystyle {\sqrt {n+1}}\cdot 2^{0.5n}} states, see Indzhev and Kiefer. SVFA: n {\displaystyle n} states, by exchanging accepting and rejecting states. 2DFA: at least n {\displaystyle n} and at most 4 n {\displaystyle 4n} states, see Geffert, Mereghetti and Pighizzini. === Concatenation === How many states does L 1 L 2 = { w 1 w 2 ∣ w 1 ∈ L 1 , w 2 ∈ L 2 } {\displaystyle L_{1}L_{2}=\{w_{1}w_{2}\mid w_{1}\in L_{1},w_{2}\in L_{2}\}} require? DFA: m ⋅ 2 n − 2 n − 1 {\displaystyle m\cdot 2^{n}-2^{n-1}} states, see Maslov and Yu, Zhuang and Salomaa. NFA: m + n {\displaystyle m+n} states, see Holzer and Kutrib. UFA: 3 4 2 m + n − 1 {\displaystyle {\frac {3}{4}}2^{m+n}-1} states, see Jirásek, Jirásková and Šebej. SVFA: Θ ( 3 n / 3 2 m ) {\displaystyle \Theta (3^{n/3}2^{m})} states, see Jirásek, Jirásková and Szabari. 2DFA: at least 2 Ω ( n ) log ⁡ m {\displaystyle {\frac {2^{\Omega (n)}}{\log m}}} and at most 2 m m + 1 ⋅ 2 n n + 1 {\displaystyle 2m^{m+1}\cdot 2^{n^{n+1}}} states, see Jirásková and Okhotin. === Kleene star === DFA: 3 4 2 n {\displaystyle {\frac {3}{4}}2^{n}} states, see Maslov and Yu, Zhuang and Salomaa. NFA: n + 1 {\displaystyle n+1} states, see Holzer and Kutrib. UFA: 3 4 2 n {\displaystyle {\frac {3}{4}}2^{n}} states, see Jirásek, Jirásková and Šebej. SVFA: 3 4 2 n {\displaystyle {\frac {3}{4}}2^{n}} states, see Jirásek, Jirásková and Szabari. 2DFA: at least 1 n 2 n 2 − 1 {\displaystyle {\frac {1}{n}}2^{{\frac {n}{2}}-1}} and at most 2 O ( n n + 1 ) {\displaystyle 2^{O(n^{n+1})}} states, see Jirásková and Okhotin. === Reversal === DFA: 2 n {\displaystyle 2^{n}} states, see Mirkin, Leiss, and Yu, Zhuang and Salomaa. NFA: n + 1 {\displaystyle n+1} states, see Holzer and Kutrib. UFA: n {\displaystyle n} states. SVFA: 2 n + 1 {\displaystyle 2n+1} states, see Jirásek, Jirásková and Szabari. 2DFA: between n + 1 {\displaystyle n+1} and n + 2 {\displaystyle n+2} states, see Jirásková and Okhotin. == Finite automata over a unary alphabet == State complexity of finite automata with a one-letter (unary) alphabet, pioneered by Chrobak, is different from the multi-letter case. Let g ( n ) = e Θ ( n ln ⁡ n ) {\displaystyle g(n)=e^{\Theta ({\sqrt {n\ln n}})}} be Landau's function. === Transformation between models === For a one-letter alphabet, transformations between different types of finite automata are sometimes more efficient than in the general case. NFA to DFA: g ( n ) + O ( n 2 ) {\displaystyle g(n)+O(n^{2})} states, see Chrobak. 2DFA to DFA: g ( n ) + O ( n ) {\displaystyle g(n)+O(n)} states, see Chrobak and Kunc and Okhotin. 2NFA to DFA: O ( g ( n ) ) {\displaystyle O(g(n))} states, see Mereghetti and Pighizzini. and Geffert, Mereghetti and Pighizzini. NFA to 2DFA: at most O ( n 2 ) {\displaystyle O(n^{2})} states, see Chrobak. 2NFA to 2DFA: at most n O ( log ⁡ n ) {\displaystyle n^{O(\log n)}} states, proved by implementing the method of Savitch's theorem, see

    Read more →
  • Transparency in the software supply chain

    Transparency in the software supply chain

    Transparency in the software supply chain is a condition in which participants involved in the development, procurement, operation, auditing, or regulation of software can determine which components, dependencies, build stages, identifiers, and relationships within the supply chain make up the delivered product. The disclosure of information about software components, their interrelationships, origins, and development methods—for the purposes of risk management, vulnerability detection, and compliance—takes place throughout the software lifecycle. Transparency is one of the key security attributes of the software supply chain, as a deeper understanding of the chain enables participants to identify vulnerabilities and mitigate threats. Problems in the software supply chain can cause billions in losses and create operational challenges for government and commercial entities, as demonstrated by incidents involving SolarWinds, Bybit, 3CX, Jaguar Land Rover, GitHub, and NotPetya. Modern software is often assembled from third-party libraries and open-source components. According to research by the Linux Foundation and Synopsys, 96% of the commercial codebases analyzed contained open-source software, and 70–90% of a typical codebase may consist of open-source components. Without transparency, any software component can become a threat. As a result, companies may spend billions of dollars building robust external defenses, but this will not protect against vulnerabilities in legitimate software inside the perimeter. At the same time, supply chain attacks also erode trust between customers and their IT providers, as malicious code is often embedded in official updates with certificates and digital signatures. One of the primary ways to ensure transparency is through a software bill of materials, which documents the components used to create the software and the relationships within the supply chain. == Concept == The software supply chain is the collection of systems, devices, people, artifacts, and processes involved in the creation of the final software product. Attacks on the software supply chain differ from conventional attacks in that they follow a four-stage pattern: compromise, modification, distribution, and subsequent exploitation of the compromised or modified component. A defining feature of a supply chain attack is the introduction or manipulation of a change at an upstream stage, which is subsequently exploited at a downstream stage. Transparency refers to the availability of knowledge about the chain, while validity concerns the integrity of operations and artifacts and the authentication of participants, and separation involves reducing unnecessary trust relationships and the radius of impact through compartmentalization. In this framework, transparency primarily helps during the pre-compromise and detection phases, as a clearer understanding of participants, operations, and artifacts makes it easier to identify weak links before attackers exploit them. Current major attack vectors include dependencies and containers, build infrastructure, and human participants, such as maintainers or developers. == History == Software supply-chain transparency developed from earlier efforts to document software components, long before the term came into widespread use in the cybersecurity field. Early component-documentation formats included SPDX, first published in 2011, and CycloneDX, first published in 2017. Initially, these formats were created to support license compliance, package identification, and tool compatibility. Their development helped shape a broader concept of software supply chain transparency, encompassing component documentation, disclosure practices, risk management, security analysis, and regulatory compliance. In 2018, the U.S. National Telecommunications and Information Administration launched a multistakeholder process on promoting software component transparency. This process helped move work on SBOMs from a specialized technical practice into the realm of policy and procurement to identify components used in software products. The 2020 compromise of the SolarWinds Orion platform made software supply chain security a central issue in government cybersecurity policy. An analysis of the “Sunburst” campaign prepared by the Atlantic Council noted that the vulnerability of the software supply chain had become a realized risk for national-security agencies. In May 2021, U.S. President Joe Biden issued Executive Order 14028, which directed federal agencies to improve cybersecurity and increase transparency in the software supply chain, including requirements related to SBOMs. Reuters reported that the executive order required software developers selling their products to the federal government to provide greater visibility into their software and make security data available. In July 2021, the NTIA published the document “The Minimum Elements for a Software Bill of Materials (SBOM)”, defining the basic data fields and practices for creating SBOMs. Between 2021 and 2025, the U.S. Cybersecurity and Infrastructure Security Agency updated its guidance on “Framing Software Component Transparency”, expanding the set of SBOM attributes, metadata requirements, and operational recommendations for the creation, exchange, and use of SBOMs. Major incidents that occurred following the SolarWinds attack have underscored the importance of transparency in vulnerability management and supply chain security. The Log4Shell vulnerability in the Log4j library, disclosed in December 2021, demonstrated how difficult it can be for organizations to identify a vulnerable component deeply embedded within applications and services. In 2024, an attempt to plant a backdoor in XZ Utils showed how attackers could exploit trust in open-source maintenance processes to introduce malicious code into widely used infrastructure software. By the mid-2020s, software supply chain transparency had become part of international cybersecurity coordination and regulation. On September 3, 2025, Japan's Ministry of Economy, Trade and Industry and the National Cybersecurity Office, in collaboration with cybersecurity agencies from 15 countries, released the document “A Shared Vision of Software Bill of Materials (SBOM) for Cybersecurity.” In the European Union, the Cyber Resilience Act required manufacturers of products with digital elements to create, maintain, and retain SBOMs as part of the technical documentation for software placed on the EU market. == Transparency mechanisms == The primary mechanism for ensuring transparency is the software bill of materials (SBOM). An SBOM is a structured list of components, libraries, and tools used to build and distribute a software product, and it records dependencies in a way that helps organizations understand and assess their software supply chains. It can also be described as a formal record of components and their interdependencies, which gives users insight into their actual exposure to risks and threats. Five key areas of SBOM application in software supply chain security have been identified: vulnerability management, ensuring transparency, component evaluation, risk assessment, and ensuring supply chain integrity. In software supply chains, an SBOM documents all components, both open-source and proprietary. Under Executive Order 14028, U.S. federal agencies require software suppliers to provide SBOMs for government-procured software. The list of minimum required SBOM elements defined by NTIA includes three main categories: required data fields for describing each component (name, version, identifiers), automation support (machine-readable format, generation tools), and recommendations for creating SBOMs during development and purchasing. The post-2021 push for SBOMs was intended to provide visibility into the components used within software and to expose parts of an application that would otherwise remain hidden. This information can be used to prioritize patches, manage vulnerabilities, and support compliance work. Transparency also supports software traceability, which is becoming a standard feature of developer platforms. Traceability has become important because organizations are increasingly required to demonstrate how software was created, rather than simply listing its components. Higher levels of assurance require signed, tamper-proof traceability and more isolated, verifiable build environments. A related mechanism is build reproducibility. Reproducible builds are defined as build processes that make the compilation process deterministic, ensuring that the same source code always produces the same binary file. These builds are considered a foundational element for distributed verification, transparency-log maintenance, supply-chain workflow integration, and the creation of keyless signatures based on verifiable logs. Although reproducibility does not replace inventory or attestation, it gives external par

    Read more →
  • Stephen Muggleton

    Stephen Muggleton

    Stephen H. Muggleton (born 6 December 1959, son of Louis Muggleton) is Professor of Machine Learning and Head of the Computational Bioinformatics Laboratory at Imperial College London. == Education == Muggleton received his Bachelor of Science degree in computer science (1982) and Doctor of Philosophy in artificial intelligence (1986) supervised by Donald Michie at the University of Edinburgh. == Career == Following his PhD, Muggleton went on to work as a postdoctoral research associate at the Turing Institute in Glasgow (1987–1991) and later an EPSRC Advanced Research Fellow at Oxford University Computing Laboratory (OUCL) (1992–1997) where he founded the Machine Learning Group. In 1997 he moved to the University of York and in 2001 to Imperial College London. From 2025, Muggleton has joined Nanjing University as a full-time professor. == Research == Muggleton's research interests are primarily in Artificial intelligence. From 1997 to 2001 he held the Chair of Machine Learning at the University of York and from 2001 to 2006 the EPSRC Chair of Computational Bioinformatics at Imperial College in London. Since 2013 he holds the Syngenta/Royal Academy of Engineering Research Chair as well as the post of Director of Modelling for the Imperial College Centre for Integrated Systems Biology. He is known for founding the field of Inductive logic programming. In this field he has made contributions to theory introducing predicate invention, inverse entailment and stochastic logic programs. He has also played a role in systems development where he was instrumental in the systems Duce, Cigol, Golem, Progol and Metagol and applications – especially biological prediction tasks. He worked on a Robot Scientist together with Ross D. King that is capable of combining Inductive Logic Programming with active learning. His present work concentrates on the development of Meta-Interpretive Learning, a new form of Inductive Logic Programming which supports predicate invention and learning of recursive programs.

    Read more →
  • AI Video Generators: Free vs Paid (2026)

    AI Video Generators: Free vs Paid (2026)

    In search of the best AI video generator? An AI video generator is software that uses machine learning to help you get more done — it turns a rough idea into a polished result in seconds. When choosing one, weigh output quality, pricing, export formats, and how well it fits the tools you already use. Whether you are a beginner or a pro, the right AI video generator slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.

    Read more →