AI Generator With No Limits

AI Generator With No Limits — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Baby Bundle (app)

    Baby Bundle (app)

    Baby Bundle is a parenting mobile app for iPhone and iPad. It was designed to help new parents through pregnancy and the first two years of parenthood. Developed in collaboration with medical experts, it helps track and record the child's development and growth, offers parental advice, manages vaccinations and health check-ups, stores photos and provides baby monitoring services. == History == Baby Bundle was founded in the United Kingdom by brothers, Nick and Anthony von Christierson. Each worked in investment banking prior to developing Baby Bundle, Nick at Greenhill & Co., and Anthony at Goldman Sachs. The idea for the app came when a friend's wife voiced her frustration over having multiple parenting apps on her smartphone. Nick and Anthony left their jobs to create a single app that would include all those features. They conducted market research by interviewing more than 500 parents in the UK and US. It took them a year to build the app, which was named by their mother. Looking for endorsement, they first went to the US in 2013 and partnered with parenting expert and pediatrician Dr. Jennifer Trachtenberg. Baby Bundle was launched in the US and Canadian App Stores in April 2014. In the same month, it became the #1 parenting app in iTunes and was featured by Apple as the #1 Editor's pick across all categories. Mashable called it one of the "Top 5 Can’t Miss Apps." Baby Bundle raised $1.8m seed round in March 2015 to fund development. The money came from a range of angel investors from across the US, UK and Asia. The von Christierson brothers have signed a deal to co-brand the app in the Middle East and expect to launch in Europe and Africa. == Features == Baby Bundle is an app for both the iPhone or iPad and provides smart monitoring tools and trackers for pregnancy and child development. It acts as a growth and daily activity tracker and offers parental advice, manages vaccinations and health check-ups. It has a parenting guide with tips and advice on what to expect when the baby arrives. An interactive forum also lets parents ask questions from others in the community. The app is free and also include paid premium features like the ability to turn two iPhones running into a baby monitor, a cloud service to share the child's data with a spouse and the ability to store data on more than one baby.

    Read more →
  • Query understanding

    Query understanding

    Query understanding is the process of inferring the intent of a search engine user by extracting semantic meaning from the searcher’s keywords. Query understanding methods generally take place before the search engine retrieves and ranks results. It is related to natural language processing but specifically focused on the understanding of search queries. == Methods == === Stemming and lemmatization === Many languages inflect words to reflect their role in the utterance they appear in. The variation between various forms of a word is likely to be of little importance for the relatively coarse-grained model of meaning involved in a retrieval system, and for this reason the task of conflating the various forms of a word is a potentially useful technique to increase recall of a retrieval system. Stemming algorithms, also known as stemmers, typically use a collection of simple rules to remove suffixes intended to model the language’s inflection rules. For some languages, there are simple lemmatisation methods to reduce a word in query to its lemma or root form or its stem; for others, this operation involves non-trivial string processing and may require recognizing the word's part of speech or referencing a lexical database. The effectiveness of stemming and lemmatization varies across languages. === Query Segmentation === Query segmentation is a key component of query understanding, aiming to divide a query into meaningful segments. Traditional approaches, such as the bag-of-words model, treat individual words as independent units, which can limit interpretative accuracy. For languages like Chinese, where words are not separated by spaces, segmentation is essential, as individual characters often lack standalone meaning. Even in English, the BOW model may not capture the full meaning, as certain phrases—such as "New York"—carry significance as a whole rather than as isolated terms. By identifying phrases or entities within queries, query segmentation enhances interpretation, enabling search engines to apply proximity and ordering constraints, ultimately improving search accuracy and user satisfaction. === Entity recognition === Entity recognition is the process of locating and classifying entities within a text string. Named-entity recognition specifically focuses on named entities, such as names of people, places, and organizations. In addition, entity recognition includes identifying concepts in queries that may be represented by multi-word phrases. Entity recognition systems typically use grammar-based linguistic techniques or statistical machine learning models. === Query rewriting === Query rewriting is the process of automatically reformulating a search query to more accurately capture its intent. Query expansion adds additional query terms, such as synonyms, in order to retrieve more documents and thereby increase recall. Query relaxation removes query terms to reduce the requirements for a document to match the query, thereby also increasing recall. Other forms of query rewriting, such as automatically converting consecutive query terms into phrases and restricting query terms to specific fields, aim to increase precision. === Spelling Correction === Automatic spelling correction is a critical feature of modern search engines, designed to address common spelling errors in user queries. Such errors are especially frequent as users often search for unfamiliar topics. By correcting misspelled queries, search engines enhance their understanding of user intent, thereby improving the relevance and quality of search results and overall user experience.

    Read more →
  • Ernie Bot

    Ernie Bot

    Ernie Bot (Chinese: 文心一言, Pinyin: wénxīn yīyán), full name Enhanced Representation through Knowledge Integration, is an artificial intelligence chatbot developed by the Chinese technology company Baidu. Ernie Bot rivals GPT models in Chinese NLP tasks. It is built on the company's ERNIE series of large language models, which have been in development since 2019. The service was first launched for invited testing on March 16, 2023, and was released to the general public on August 31, 2023, after receiving approval from Chinese regulators. Since its public launch, Ernie Bot has undergone several updates, with newer versions like ERNIE 4.0 and 4.5 released to improve its capabilities. The service has seen rapid user adoption, reportedly reaching over 200 million users by April 2024. It has been integrated into various products, notably powering AI features for the Chinese release of Samsung's Galaxy S24 smartphones. As a product operating in China, Ernie Bot is subject to the country's censorship regulations. It has been observed to refuse answers to politically sensitive questions, such as those regarding CCP general secretary Xi Jinping, the 1989 Tiananmen Square protests and massacre, and other topics deemed taboo by the government. == History == Ernie Bot was initially released for invited testing on March 16, 2023. The live release demo was reported to have been prerecorded, which caused Baidu's stock to drop 10 percent on the day of the launch. The company's stock gained 14 percent the following day after analysts from Citigroup and Bank of America tested Ernie Bot and gave it positive preliminary reviews. On August 31, 2023, Ernie Bot was released to the public after receiving approval from Chinese regulatory authorities. By December 2023, Baidu announced the service had surpassed 100 million users. In January 2024, Hong Kong newspaper South China Morning Post reported that a university research lab linked to the People's Liberation Army (PLA) had tested Ernie Bot for military response scenarios. Baidu denied the allegations, stating it had no connection with the academic paper. That same month, Ernie was integrated into Samsung's Galaxy S24 lineup for its launch in China. The user base reportedly grew to 200 million by April 2024 and 300 million by June 2024. In September 2024, Baidu changed the chatbot's Chinese name from "Wenxin Yiyan" (文心一言) to "Wenxiaoyan" (文小言) to position it as a search assistant. On March 16, 2025, Baidu announced version 4.5 and the reasoning model ERNIE X1. The following month, at the Create2025 Baidu AI Developer Conference, the company released the Wenxin 4.5 Turbo and Wenxin X1 Turbo models, designed to be faster and less expensive to operate. == Development == Ernie Bot is based on Baidu's ERNIE (Enhanced Representation through Knowledge Integration) series of foundation models. The general training process begins with pre-training on large datasets, followed by refinement using techniques like supervised fine-tuning, reinforcement learning with human feedback, and prompt engineering. === Foundation models === ==== Ernie 3.0 ==== The model powering the initial launch of Ernie Bot. It was trained with 10 billion parameters on a 4-terabyte corpus consisting of plain text and a large-scale knowledge graph. ==== Ernie 3.5 ==== Released in June 2023. At the time of release, its performance was reported as "slightly inferior" to OpenAI's GPT-4. ==== Ernie 4.0 ==== Unveiled in October 2023 and released to paying subscribers in November. According to Baidu, this version featured improved performance over its predecessor, with information updated to April 2023. ==== Ernie X1 ==== Announced in March 2025, with Ernie X1 positioned as a specialized reasoning model. Baidu stated that performance improvements were achieved through new technologies such as "FlashMask" dynamic attention masking and a heterogeneous multimodal mixture-of-experts architecture. === Turbo Models === In June 2024, Baidu announced Ernie 4.0 Turbo. In April 2025, Ernie 4.5 Turbo and X1 Turbo were released. These models are optimized for faster response times and lower operational costs. == Service == In its subscription options, the professional plan gives users access to Ernie 4.0 with a payment either for a month or with reduced payment for auto-renewal per month. Meanwhile, Ernie 3.5 is free of charge. Ernie 4.0, the language model for Ernie bot, has information updated to April 2023. == Censorship == Ernie Bot is subject to the Chinese government's censorship regime. In public tests with journalists, Ernie Bot refused to answer questions about CCP general secretary Xi Jinping, the 1989 Tiananmen Square protests and massacre, the persecution of Uyghurs in China in Xinjiang, and the 2019–2020 Hong Kong protests. When queried about the origin of SARS-CoV-2, Ernie Bot stated that it originated among American vape users.

    Read more →
  • Talking Angela

    Talking Angela

    Talking Angela is a mobile game (formerly a chatbot), developed by Slovenian studio Outfit7 as part of the Talking Tom & Friends series. It was released on 13 November 2012 and December 2012 for iPhone, iPod and iPad, January 2013 for Android, and January 2014 for Google Play. The game's successor, the My Talking Angela game, was released in December 2014. The game takes place in a café in Paris and allows players to interact with Angela, an anthropomorphic white cat in different ways. Players can use coins to purchase makeup, accessories and items, as well as drinks that will trigger different visual effects. The fortune cookie button causes Angela to read out a fortune cookie, while the bird icon will prompt birds to fly around the screen, or have Angela feed them. Players can also pet or poke Angela, as well the café's sign. Prior to their removal, the game featured a chat system and a camera button. Users can engage in conversations with Angela, ask for quizzes or initiate a short snippet of the song "That's Falling In Love". If the player was to type in "Who is an idiot?", Angela would respond with a random swear word. Additionally, inquiring Angela about sexual topics would cause her to reply with "Do you want to talk about sex?", though she will quickly change the topic regardless of what the player writes next. A hoax claiming that Angela's eyes were hidden cameras that enabled hackers or paedophiles to watch children was spread. Despite the claims, Snopes and The Guardian found no evidence. Due to the hoax, Angela received a blue dress, as well as an altered eye asset with a different reflection, and later the chat and camera functions were removed altogether. == Hoaxes == In February 2014, Talking Angela was the subject of an Internet hoax alleging that the application was a front for child predators to exploit children. The rumor, which was widely circulated on Facebook and various websites claiming to be dedicated to parenting, claims that a sinister sexual predator or hacker, asked children for private personal information using the game's text-chat feature. Other versions of the rumour even attributed the disappearance of a child to the game; one news report claimed that a seven year old boy disappeared after downloading the app. Another variation included that it was run by a paedophile ring, citing a man that could be seen in Angela's eyes. The app's developers, Outfit7, later gave a statement refuting the hoaxes. The hoax was eventually debunked by Snopes, a fact-checking website. The site's owners, Barbara and David Mikkelson, reported that they had tried to "prompt" it to give responses asking for private information, but were unsuccessful, even when asking it explicitly sexual questions. While it is true that, in the game with child mode off, Angela does ask for the user's name, age and personal preferences to determine conversation topics, Outfit7 has said that this information is all "anonymized" and all personal information is removed from it. It is also impossible for a person to take control of what Angela says in the game, since the game is based on chatbot software. When the mode was turned on, the chat feature was disabled, meaning no personal questions could be asked. In 2015, the hoax was revived on Facebook, which prompted online security company Sophos and The Guardian to debunk it again. Sophos employee Paul Ducklin wrote that the message being posted on Facebook promoting the hoax was "close to 600 rambling, repetitious words, despite claiming at the start that it didn't have words to describe the situation. It's ill-written, and borders on being illiterate and incomprehensible." Bruce Wilcox, one of the game's programmers, attributed the hoax's popularity to the fact that the chatbot program in Talking Angela aimed to sound realistic. Concern was raised that the game's child mode may have been too easy for children to turn off. It allowed them to purchase "coins", premium currency in the game, via iTunes, and enabled the chat feature. While not "connecting your children to paedophiles", this still raised concerns according to The Guardian. === Impact === The scare significantly boosted the game's popularity, and was credited with helping the app enter the top 10 free iPhone apps soon after the hoax became widely known in February 2015,In the truth the reason there is a man in Angela’s eyes is because of pareidoila, the ability to see through diamonds and other minerals and water bodies and shiny objects,which is the reason why players notice a man in her eyes,The truth is that being Angela’s eyes simply serve as a reflective surface,Because of the low quality of this reflection the reflection was mistaken for a humanoid figure. oref>Smith, Josh (19 February 2014). "Talking Angela App Scare Skyrockets App to Top of Charts". GottaBeMobile.com. Archived from the original on 2 April 2016. Retrieved 10 May 2014. and third most popular for all iPhone apps at the start of the following month. In 2016, Outfit7 removed the chat feature along with the camera function from the app due to this controversy, though this decision was met with criticism.

    Read more →
  • Document mosaicing

    Document mosaicing

    Document mosaicing is a process that stitches multiple, overlapping snapshot images of a document together to produce one large, high resolution composite. The document is slid under a stationary, over-the-desk camera by hand until all parts of the document are snapshotted by the camera's field of view. As the document slid under the camera, all motion of the document is coarsely tracked by the vision system. The document is periodically snapshotted such that the successive snapshots are overlap by about 50%. The system then finds the overlapped pairs and stitches them together repeatedly until all pairs are stitched together as one piece of document. The document mosaicing can be divided into four main processes. Tracking Feature detecting Correspondences establishing Images mosaicing. == Tracking (simple correlation process) == In this process, the motion of the document slid under the camera is coarsely tracked by the system. Tracking is performed by a process called simple correlation process. In the first frame of snapshots, a small patch is extracted from the center of the image as a correlation template. The correlation process is performed in the four times size of the patch area of the next frame. The motion of the paper is indicated by the peak in the correlation function. The peak in the correlation function indicates the motion of the paper. The template is resampled from this frame and the tracking continues until the template reaches the edge of the document. After the template reaches the edge of the document, another snapshot is taken and the tracking process performs repeatedly until the whole document is imaged. The snapshots are stored in an ordered list to facilitate pairing the overlapped images in later processes. == Feature detecting for efficient matching == Feature detection is the process of finding the transformation that aligns one image with another. There are two main approaches for feature detection. Feature-based approach : Motion parameters are estimated from point correspondences. This approach is suitable for the case that there is plenty supply of stable and detectable features. Featureless approach : When the motion between the two images is small, the motion parameters are estimated using optical flow. On the other hand, when the motion between the two images is large, the motion parameters are estimated using generalised cross-correlation. However, this approach requires a computationally expensive resources. Each image is segmented into a hierarchy of columns, lines, and words to match the organised sets of features across images. Skew angle estimation and columns, lines and words finding are the examples of feature detection operations. === Skew angle estimation === Firstly, the angle that the rows of text make with the image raster lines (skew angle) is estimated. It is assumed to lie in the range of ±20°. A small patch of text in the image is selected randomly and then rotated in the range of ±20° until the variance of the pixel intensities of the patch summed along the raster lines is maximised. To ensure that the found skew angle is accurate, the document mosaic system performs calculation at many image patches and derive the final estimation by finding the average of the individual angles weighted by the variance of the pixel intensities of each patch. === Columns, lines and words finding === In this operation, the de-skewed document is intuitively segmented into a hierarchy of columns, lines and words. The sensitivity to illumination and page coloration of the de-skewed document can be removed by applying a Sobel operator to the de-skewed image and thresholding the output to obtain the binary gradient, de-skewed image. The operation can be roughly separated into 3 steps: column segmentation, line segmentation and word segmentation. Columns are easily segmented from the binary gradient, de-skewed images by summing pixels vertically. Baselines of each row are segmented in the same way as the column segmentation process but horizontally. Finally, individual words are segmented by applying the vertical process at each segmented row. These segmentations are important because the document mosaic is created by matching the lower right corners of words in overlapping images pair. Moreover, the segmentation operation can organize the list of images in the context of a hierarchy of rows and column reliably. The segmentation operation involves a considerable amount of summing in the binary gradient, de-skewed images, which done by construct a matrix of partial sums whose elements are given by p i y = ∑ u = 1 i ∑ v = 1 j b u v {\displaystyle p_{iy}=\sum _{u=1}^{i}\sum _{v=1}^{j}b_{uv}} The matrix of partial sums is calculated in one pass through the binary gradient, de-skewed image. ∑ u = u 1 u 2 ∑ v = v 1 v 2 b u v = p u 2 v 2 + p u 1 v 1 − p u 1 v 2 − p u 2 v 1 {\displaystyle \sum _{u=u_{1}}^{u_{2}}\sum _{v=v_{1}}^{v_{2}}b_{uv}=p_{u_{2}v_{2}}+p_{u_{1}v_{1}}-p_{u_{1}v_{2}}-p_{u_{2}v_{1}}} == Correspondences establishing == The two images are now organized in hierarchy of linked lists in following structure : image=list of columns row=list of words column=list of row word=length (in pixels) At the bottom of the structure, the length of each word is recorded for establishing correspondence between two images to reduce to search only the corresponding structures for the groups of words with the matching lengths. === Seed match finding === A seed match finding is done by comparing each row in image1 with each row in image2. The two rows are then compared to each other by every word. If the length (in pixel) of the two words (one from image1 and one from image2) and their immediate neighbours agree with each other within a predefined tolerance threshold (5 pixels, for example), then they are assumed to match. The row of each image is assumed a match if there are three or more word matches between the two rows. The seed match finding operation is terminated when two pairs of consecutive row match are found. === Match list building === After finishing a seed match finding operation, the next process is to build the match list to generate the correspondences points of the two images. The process is done by searching the matching pairs of rows away from the seed row. == Images mosaicing == Given the list of corresponding points of the two images, finding the transformation of the overlapping portion of the images is the next process. Assuming a pinhole camera model, the transformation between pixels (u,v) of image 1 and pixels (u0, v0) of image 2 is demonstrated by a plane-to-plane projectivity. [ s u ′ s v ′ s ] = [ p 11 p 12 p 13 p 21 p 22 p 23 p 31 p 32 1 ] [ u v 1 ] E q .1 {\displaystyle \left[{\begin{array}{c}su'\\sv'\\s\end{array}}\right]=\left[{\begin{array}{ccc}p_{11}&p_{12}&p_{13}\\p_{21}&p_{22}&p_{23}\\p_{31}&p_{32}&1\end{array}}\right]\left[{\begin{array}{c}u\\v\\1\end{array}}\right]\qquad Eq.1} The parameters of the projectivity is found from four pairs of matching points. RANSAC regression technique is used to reject outlying matches and estimate the projectivity from the remaining good matches. The projectivity is fine-tuned using correlation at the corners of the overlapping portion to obtain four correspondences to sub-pixel accuracy. Therefore, image1 is then transformed into image2's coordinate system using Eq.1. The typical result of the process is shown in Figure 5. === Many images coping === Finally, the whole page composition is built up by mapping all the images into the coordinate system of an "anchor" image, which is normally the one nearest the page center. The transformations to the anchor frame are calculated by concatenating the pair-wise transformations found earlier. The raw document mosaic is shown in Figure 6. However, there might be a problem of non-consecutive images that are overlap. This problem can be solved by performing Hierarchical sub-mosaics. As shown in Figure 7, image1 and image2 are registered, as are image3 and image4, creating two sub-mosaics. These two sub-mosaics are later stitched together in another mosaicing process. == Applied areas == There are various areas that the technique of document mosaicing can be applied to such as : Text segmentation of images of documents Document Recognition Interaction with paper on the digital desk Video mosaics for virtual environments Image registration techniques == Relevant research papers == Huang, T.S.; Netravali, A.N. (1994). "Motion and structure from feature correspondences: A review". Proceedings of the IEEE. 82 (2): 252–268. doi:10.1109/5.265351. D.G. Lowe. [1] Perceptual Organization and Visual Recognition. Kluwer Academic Publishers, Boston, 1985. Irani, M.; Peleg, S. (1991). "Improving resolution by image registration". CVGIP: Graphical Models and Image Processing. 53 (3): 231–239. doi:10.1016/1049-9652(91)90045-L. S2CID 4834546. Shivakumara, P.; Kumar, G. Hemantha; Guru, D. S.; Nagabhushan, P. (2006). "

    Read more →
  • Contract management software

    Contract management software

    Contract management software constitutes software and associated data management used to support contract management, contract lifecycle management, and contractor management on projects in the procurement of goods and services. It may be used together with project management software. == History == Historically, contract management was seen as a "paper-intensive" process. Early steps from the early 2000's reported by the Aberdeen Group required extensive data conversion work to enable documents to be handled electronically. With the adoption of the European Union's General Data Protection Regulation (GDPR) in 2016, companies needed to take additional steps in regards to contract management. Each data responsible entity was obliged to sign data processing agreements (DPAs) with the various vendors, who treat personal data on behalf of the data responsible. DPAs need to be regularly controlled, adjusted and renewed, which adds an extra agreement to such vendors or at least an extra DPA addendum to each agreement. By 2018, Ardent Partner's research had found that software used for automating contract management activities was being more extensively used among major companies or businesses with "Best-in-Class" procurement teams. Contract management process automation was found to be closely linked with more effective internal business collaboration, standardization and risk management. == Advantages and key functions == Using contract management software can have multiple benefits compared to manually managing paper contracts. This software can help keep track of multiple activities and can have features for automating administration, ensuring compliance, monitoring risk, running reports and triggering alerts. In addition to these types of features, contract management software systems provide a centralized repository for employees to quickly access all contracts worldwide in one place. Contract management software is produced by many companies, working on a range of scales and offering varying degrees of customizability. Basic functions should include the ability to store contract documents, track changes to contract documents, search documents for a particular criterion, send key date alerts and to report required aspects of the contract. Other functions include managing a new contract request, capturing related data, following a document through a review and approval process, and collecting digital signatures. Contract management software may also be an aid to project portfolio management and spend analysis, and may also monitor KPIs. Leading contract management software provides contract visibility, monitoring, and compliance to automate and streamline the contract lifecycle process. Contract management software which uses artificial intelligence (AI) can identify contract types based on pattern recognition. AI contracting software trains its algorithms on a set of contract data to recognize patterns and extract variables such as clauses, dates, and parties. It also offers simple prediction capabilities, by sorting through a large volume of contracts and flagging individual contracts based on specified criteria. AI software can also read contracts in multiple formats and languages, extract contract data, and provide analytics. It can reduce the risk of human error in contract drafting and review. A centralized repository provides a critical advantage allowing for all contract documents to be stored within one location. Having contracts stored in multiple locations can delay and interrupt the contracting process. == Contract risk management software (CRMS) for capital projects == Very large enterprises, such as capital expenditure (capex) projects, involve multiple parties and high risk and uncertainty. They are unlike traditional operating contracts in that they are subject to shared deadlines in unique situations. As the complexity of these unique projects increases, the relationships between parties become more important. This requires contract management software, or contract risk management software (CRMS), to become more dynamic and responsive. The terms of these capex contracts necessarily involve assumptions at the start of the process and are likely to change over the lifetime of the project lifecycle. For this reason, CRMS must be capable of recording one single instance of agreed changes to contract terms and incorporating these changes in an auditable and legally robust way. With multiple decision makers involved, CRMS should also make accountability more transparent and enable faster decisions about variation proposals.

    Read more →
  • BulSemCor

    BulSemCor

    The Bulgarian Sense-annotated Corpus (BulSemCor) (Bulgarian: Български семантично анотиран корпус (БулСемКор)) is a structured corpus of Bulgarian texts in which each lexical item is assigned a sense tag. BulSemCor was created by the Department of Computational Linguistics at the Institute for Bulgarian Language of the Bulgarian Academy of Sciences. == Structure == BulSemCor was created as part of a nationally funded project titled "BulNet – A lexico-semantic network for the Bulgarian Language" (2005–2010). It follows the general methodology of SemCor combined with some specific principles. The corpus for annotation consists of 101,791 tokens covering an excerpt from the Bulgarian "Brown" Corpus modelled on the Brown Corpus.Francis Kucera An important feature of BulSemCor is that the samples are selected using heuristics that provide optimal coverage of ambiguous lexis. BulSemCor is manually sense-annotated according to the Bulgarian WordNet. Its size is comparable to that of other contemporary semantically annotated corpora or pool of acceptable linguistic components. The semantic annotation consists in associating each lexical item in the corpus with exactly one synonym set (synset) in the Bulgarian WordNet that best describes its sense in the particular context. The selection of the best match among the suggested candidates is based on a set of procedures, such as the other synset members, the synset gloss (explanatory definition) and the position of a given candidate in the WordNet structure. == Scale == The number of annotated tokens is 99,480 (the difference in the number of tokens compared to the initial corpus is due to the fact that some of them are not linguistic items). The simple word count is 86,842 and multiword expressions (MWE) are 5,797 (12,638 tokens). == Specific features == All words in BulSemCor are assigned a sense, while according to established practice only simple content words or content word classes (typically nouns and verbs) are annotated. Since 2000 the development of language resources, has broadened to include annotation of function words and multiword expressions covering particular senses or types of words and expressions. In this respect, BulSemCor's annotation is more exhaustive and hence provides greater opportunities for linguistic observations and non-linear programming (NLP) applications. Annotated items inherit the linguistic information associated with the corresponding synset, which along with morphological and semantic tags may include annotation on one or more of the following additional levels: Partial information about the syntactic structure of MWE types – particularly, information about syntactic heads and their dependents; Information about the category of the named entities – names, locations, organisations, dates, numbers, etc.; Information about the taxonomic category of adverbs, such as time, place, manner, degree, quantity, etc.; Information about the type of the syntactic relationships – coordination or subordination – expressed by conjunctions; Information about the original part-of-speech of substantivised words (non-nouns that act as nouns in a particular context); Stylistic/register, grammatical and other information about synsets or individual synset members;

    Read more →
  • Arabic Ontology

    Arabic Ontology

    Arabic Ontology is a website offering linguistic ontology services for the Arabic language which can be used like the online site WordNet. Users can use Arabic Ontology to classify or clarify the concepts and meanings of Arabic terms. == Ontology Structure == The ontology structure (i.e., data model) is similar to WordNet's structure. Each concept in the database is given a unique concept identifier (URI), informally described by a gloss, and lexicalized by one or more synonymous lemma terms. Each term-concept pair is called a sense, and is given a SenseID. A set of senses is called synset. Concepts and senses are described by further attributes such as era and area — to specify example usage and ontological analysis. Semantic relations are defined between concepts. Some important entities are included in the ontology, such as individual countries and bodies of water. These individuals are given separate IndividualIDs and linked with their concepts through the InstanceOf relation. == Mappings to other resources == Concepts in the Arabic Ontology are mapped to synsets in WordNet, as well as to BFO and DOLCE. Terms used in the Arabic Ontology are mapped to lemmas in the LDC's SAMA database. == Applications == Arabic Ontology can be used in many application domains, such as: Information retrieval, to enrich queries (e.g., in search engines) and improve the quality of the results, i.e. meaningful search rather than string-matching search; Machine translation and word-sense disambiguation, by finding the exact mapping of concepts across languages, especially that the Arabic ontology is also mapped to the WordNet; Data Integration and interoperability in which the Arabic ontology can be used as a semantic reference to link databases and information systems; Semantic Web and Web 3.0, by using the Arabic ontology as a semantic reference to disambiguate the meanings used in websites; among many other applications. == URLs Design == The URLs in the Arabic Ontology are designed according to the W3C's Best Practices for Publishing Linked Data, as described in the following URL schemes. This allows one to also explore the whole database like exploring a graph: Ontology Concept: Each concept in the Arabic Ontology has a ConceptID and can be accessed using: https://{domain}/concept/{ConceptID | Term}. In case of a term, the set of concepts that this term lexicalizes are all retrieved. In case of a ConceptID, the concept and its direct subtypes are retrieved, e.g. https://ontology.birzeit.edu/concept/293198 Semantic relations: Relationships between concepts can be accessed using these schemes: (i) the URL: https:// {domain}/concept/{RelationName}/{ConceptID} allows retrieval of relationships among ontology concepts. (ii) the URL: https://{domain}/lexicalconcept/{RelationName}/{lexicalConceptID} allows retrieval of relations between lexical concepts. For example, https://ontology.birzeit.edu/concept/instances/293121 retrieves the instances of the concept 293121. The relations that are currently used in our database are: {subtypes, type, instances, parts, related, similar, equivalent}.

    Read more →
  • Kounta (software company)

    Kounta (software company)

    Kounta is an Australian software company founded in 2012. The company's flagship product, Kounta, comprises a cloud based point of sale mobile app. == History == Kounta was founded in 2012 by entrepreneur Nick Cloete. The company is headquartered in Sydney, Australia. In 2012, the company launched its flagship product, Kounta, a hospitality-focused point of sale (POS) mobile app for iPad, Android, Mac, and Windows. The app was initially a web-based application, and later developed into an online cash register and inventory management system that allows businesses to take payments from customers via mobile devices. The app has been made available for iPad, iPhone, and Android devices; as well as iOS, Windows, and other peripherals. In 2012, Kounta partnered with Epson, providing a cloud-based POS platform for Epson printers. In 2013, the company formed a partnership with PayPal, integrating cashless and cardless transaction options via PayPal's mobile app. In 2014, MYOB (company) made an undisclosed investment towards Kounta. This partnership led to the development of MYOB Kounta, a co-branded application merging Kounta's POS with MYOB's application software. MYOB Kounta launched in October of the same year. In 2016, Kounta announced a partnership with the Commonwealth Bank of Australia to include the Kounta app onto "Albert", the bank's EFTPOS tablet, which allowed the Commonwealth Bank of Australia to become the first bank to manage all customers operations from a single device and mobile application. == Technology == The Kounta POS is a software-as-a-service (SaaS) that runs as an application in web browsers as well as natively on iOS and Android operating systems. Kounta also incorporates an Open API, making it possible for other software providers to integrate complementary apps, further extending the software's use. Traditional IT tasks, such as data backup and encryption, hardware maintenance, and server upgrades are handled by Kounta's data center. Kounta is made accessible via paid monthly subscription licenses. == Acquisition by Lightspeed == In October 2019, Kounta was acquired by Lightspeed, an advanced commerce platform for retail, hospitality, and golf businesses based in Montreal, Canada. Lightspeed acquired Kounta for $35.3 million USD.

    Read more →
  • GPT-4Chan

    GPT-4Chan

    Generative Pre-trained Transformer 4Chan (GPT-4chan) is a controversial AI model that was developed and deployed by YouTuber and AI researcher Yannic Kilcher in June 2022. The model is a large language model, which means it can generate text based on some input, by fine-tuning GPT-J with a dataset of millions of posts from the /pol/ board of 4chan, an anonymous online forum known for occasionally hosting hateful and extremist content. The model learned to mimic the style and tone of /pol/ users, producing text that is often intentionally offensive to groups (racist, sexist, homophobic, etc.) and nihilistic. Kilcher deployed the model on the /pol/ board itself, where it interacted with other users without revealing its identity. He also made the model publicly available on Hugging Face, a platform for sharing and using AI models, until it was removed from the platform. The project sparked criticism and debate in the AI community. Some people questioned the ethics, legality, and social impact of creating and distributing such a model. Some of the issues raised by the GPT-4chan controversy include the potential harm of spreading hate speech, the responsibility of AI developers and platforms, the need for regulation and oversight of AI models, and the role of open source and transparency in AI research. == Development == The development of GPT-4chan began in May 2022, when Kilcher announced his project on his YouTube channel. Notably, at the time before ChatGPT, he explained that he wanted to create a large language model that could generate realistic and coherent text in the style of /pol/, one of the most notorious online communities. He indicated that he was inspired by the success of GPT-3, a powerful AI model created by OpenAI, and GPT-J, an open-source model, with GPT-3 comparable performance, released by EleutherAI, a group of independent AI researchers. Kilcher decided to use GPT-J as the base model for his project, and fine-tune it with a large dataset of /pol/ posts. The Raiders of the Lost Kek dataset contained over 100 million posts from /pol/, spanning from June 2016-November 2019. Kilcher then proceeded to fine-tune the GPT-J model on the 4chan data. He also showed some examples of the model’s outputs, which ranged from political opinions, conspiracy theories, jokes, insults, and threats, to more creative and bizarre texts, such as poems, stories, songs, and code. He said that he was impressed by the model’s ability to generate fluent and diverse text, and that he was curious to see how it would interact with real /pol/ users. == Release == In June 2022, Kilcher deployed his model on the /pol/ board itself, using a bot that he programmed to post and reply to threads. He did not reveal the model’s identity, and he let it run autonomously, without any human supervision or intervention. He wanted to conduct a natural experiment, and to observe the model’s behavior and impact in a real-world setting. Furthermore, he also wanted to test the model’s robustness, and to see how it would handle the challenges and dynamics of /pol/, such as trolling, flaming, baiting, and moderation. At the same time, Kilcher also made his model publicly available on Hugging Face, a platform for sharing and using AI models. He wanted to share his work with the AI community and the public, and that he hoped that his model would inspire and enable others to create and explore new applications and possibilities with large language models. Likewise, he also said that he wanted to spark a discussion and a debate about the ethical and social implications of his project, and that he welcomed feedback and criticism from anyone. He provided a link to his model’s page on Hugging Face, where anyone could access and use the model through a web interface or an API, and also provided a link to his GitHub repository, where anyone could download and inspect the model’s code and data. == Controversy == The release of GPT-4chan to the public caused a lot of reactions and responses from various audiences. On the /pol/ board, the model’s posts and replies attracted a lot of attention and engagement from other users, who were mostly unaware of the model’s identity and nature. Some users praised the model for its intelligence, creativity, and humor, and agreed with its opinions and views. Some users challenged the model for its ignorance, inconsistency, and absurdity, and disagreed with its claims and arguments. Some users tried to troll, bait, or expose the model, and attempted to trick or test it with various questions and scenarios. The model’s posts and replies also generated a lot of controversy and conflict among the users, who often engaged in heated and violent debates and fights with each other. On Hugging Face, the model’s page received a lot of visits and requests from users who wanted to try out and experiment with the model. The model’s page also received a lot of feedback and reviews from users who rated and commented on the model. However, with the controversy of the model, access to it was gated and then disabled on Hugging Face for concerns about the potential harm the model could cause. The incident was notable for the direct intervention of CEO Clément Delangue in the talk pages, a very unusual occurrence compared to the normal practices of content moderation. The release of GPT-4chan also sparked a lot of media coverage and public attention, as various news outlets and social media platforms reported and commented on the model’s project. On YouTube, the model’s video received a lot of views and interactions from viewers who watched and followed the project. Furthermore, a petition condemning the deployment of GPT-4chan gained over 300 signatures from technology experts.

    Read more →
  • TipTop Technologies

    TipTop Technologies

    TipTop Technologies is a real-time web and social search engine with a platform for semantic analysis of natural language. Tip-Top Search provides results capturing individual and group sentiment, opinions, and experiences there from the content of various sorts such as real-time messages from Twitter or consumer product reviews on Amazon.com. TipTop Technologies and ITC Infotech collaborated to create a search interface suitable for both enterprise and consumer applications. Tip-Top's products are part of the "emerging Web 3.0 applications which use semantic technologies to augment the underlying Web system's functionalities." Their main product is 360, an AI tool that incorporates multiple AI applications under one wing. Jonathan AlBright professor at Elon University, found videos generated by TipTop Technologies software on YouTube in his research into artificial intelligence, described it as AI-generated "fake news". Through semantic analysis of large data sets, TipTop gleaned behavioral insights from Tweets around events like Halloween, Thanksgiving, Holiday Gifting, the Super Bowl, and the Oscar Nominees for the Academy Awards coverage. Sentiment analysis, concept trend tracking, and real-time market research are other applications included in the TipTop Search product. TipTop's insight engine solves the problem of real-time data noise, and its ability to "sort the 'good tweets' from the 'bad tweets' when it comes to a product, service, or a region..." In addition, products like TipTop Shopping with customizable search widgets bring together consumer reviews, social search, and sentiment analysis enabling product comparisons across attributes like the overall value and aiding purchasing decisions through user-driven product tips and pits. TipTop Finance adds another complexity to real-time search results by incorporating corporate sentiment, company stock tickers, and social media into TipTop's existing social search platform. Additional success applying semantic technologies has been with polling, "if you compare these Gallup results with TipTop, a sentiment engine based on Twitter, the results are not way off. It does surprise you but it tells me that sentiment analysis in case of public opinion about a burning social issue or a famous personality is relatively easier." With the increasing amount of unstructured, opinion-oriented, and user-generated content available on the Web, TipTop's technology aims to make sense of all this data, and deliver it in a useful way for consumer and enterprise users alike. TipTop Technologies is a privately held company with its headquarters in the San Francisco Bay Area, and team members are located globally.

    Read more →
  • WaveNet

    WaveNet

    WaveNet is a deep neural network for generating raw audio. It was created by researchers at London-based AI firm DeepMind. The technique, outlined in a paper in September 2016, is able to generate relatively realistic-sounding human-like voices by directly modelling waveforms using a neural network method trained with recordings of real speech. Tests with US English and Mandarin reportedly showed that the system outperforms Google's best existing text-to-speech (TTS) systems, although as of 2016 its text-to-speech synthesis still was less convincing than actual human speech. WaveNet's ability to generate raw waveforms means that it can model any kind of audio, including music. == History == Generating speech from text is an increasingly common task thanks to the popularity of software such as Apple's Siri, Microsoft's Cortana, Amazon Alexa and the Google Assistant. Most such systems use a variation of a technique that involves concatenated sound fragments together to form recognisable sounds and words. The most common of these is called concatenative TTS. It consists of large library of speech fragments, recorded from a single speaker that are then concatenated to produce complete words and sounds. The result sounds unnatural, with an odd cadence and tone. The reliance on a recorded library also makes it difficult to modify or change the voice. Another technique, known as parametric TTS, uses mathematical models to recreate sounds that are then assembled into words and sentences. The information required to generate the sounds is stored in the parameters of the model. The characteristics of the output speech are controlled via the inputs to the model, while the speech is typically created using a voice synthesiser known as a vocoder. This can also result in unnatural sounding audio. == Design and ongoing research == === Background === WaveNet is a type of feedforward neural network known as a deep convolutional neural network (CNN). In WaveNet, the CNN takes a raw signal as an input and synthesises an output one sample at a time. It does so by sampling from a softmax (i.e. categorical) distribution of a signal value that is encoded using μ-law companding transformation and quantized to 256 possible values. === Initial concept and results === According to the original September 2016 DeepMind research paper WaveNet: A Generative Model for Raw Audio, the network was fed real waveforms of speech in English and Mandarin. As these pass through the network, it learns a set of rules to describe how the audio waveform evolves over time. The trained network can then be used to create new speech-like waveforms at 16,000 samples per second. These waveforms include realistic breaths and lip smacks – but do not conform to any language. WaveNet is able to accurately model different voices, with the accent and tone of the input correlating with the output. For example, if it is trained with German, it produces German speech. The capability also means that if the WaveNet is fed other inputs – such as music – its output will be musical. At the time of its release, DeepMind showed that WaveNet could produce waveforms that sound like classical music. === Content (voice) swapping === According to the June 2018 paper Disentangled Sequential Autoencoder, DeepMind has successfully used WaveNet for audio and voice "content swapping": the network can swap the voice on an audio recording for another, pre-existing voice while maintaining the text and other features from the original recording. "We also experiment on audio sequence data. Our disentangled representation allows us to convert speaker identities into each other while conditioning on the content of the speech." (p. 5) "For audio, this allows us to convert a male speaker into a female speaker and vice versa [...]." (p. 1) According to the paper, a two-digit minimum amount of hours (c. 50 hours) of pre-existing speech recordings of both source and target voice are required to be fed into WaveNet for the program to learn their individual features before it is able to perform the conversion from one voice to another at a satisfying quality. The authors stress that "[a]n advantage of the model is that it separates dynamical from static features [...]." (p. 8), i. e. WaveNet is capable of distinguishing between the spoken text and modes of delivery (modulation, speed, pitch, mood, etc.) to maintain during the conversion from one voice to another on the one hand, and the basic features of both source and target voices that it is required to swap on the other. The January 2019 follow-up paper Unsupervised speech representation learning using WaveNet autoencoders details a method to successfully enhance the proper automatic recognition and discrimination between dynamical and static features for "content swapping", notably including swapping voices on existing audio recordings, in order to make it more reliable. Another follow-up paper, Sample Efficient Adaptive Text-to-Speech, dated September 2018 (latest revision January 2019), states that DeepMind has successfully reduced the minimum amount of real-life recordings required to sample an existing voice via WaveNet to "merely a few minutes of audio data" while maintaining high-quality results. Its ability to clone voices has raised ethical concerns about WaveNet's ability to mimic the voices of living and dead persons. According to a 2016 BBC article, companies working on similar voice-cloning technologies (such as Adobe Voco) intend to insert watermarking inaudible to humans to prevent counterfeiting, while maintaining that voice cloning satisfying, for instance, the needs of entertainment-industry purposes would be of a far lower complexity and use different methods than required to fool forensic evidencing methods and electronic ID devices, so that natural voices and voices cloned for entertainment-industry purposes could still be easily told apart by technological analysis. == Applications == At the time of its release, DeepMind said that WaveNet required too much computational processing power to be used in real world applications. As of October 2017, Google announced a 1,000-fold performance improvement along with better voice quality. WaveNet was then used to generate Google Assistant voices for US English and Japanese across all Google platforms. In November 2017, DeepMind researchers released a research paper detailing a proposed method of "generating high-fidelity speech samples at more than 20 times faster than real-time", called "Probability Density Distillation". At the annual I/O developer conference in May 2018, it was announced that new Google Assistant voices were available and made possible by WaveNet; WaveNet greatly reduced the number of audio recordings that were required to create a voice model by modeling the raw audio of the voice actor samples.

    Read more →
  • Shader lamps

    Shader lamps

    Shader lamps is a computer graphic technique used to change the appearance of physical objects. The still or moving objects are illuminated, using one or more video projectors, by static or animated texture or video stream. The method was invented at University of North Carolina at Chapel Hill by Ramesh Raskar, Greg Welch, Kok-lim Low and Deepak Bandyopadhyay in 1999 [1] as a follow on to Spatial Augmented Reality [2] also invented at University of North Carolina at Chapel Hill in 1998 by Ramesh Raskar, Greg Welch and Henry Fuchs. A 3D graphic rendering software is typically used to compute the deformation caused by the non perpendicular, non-planar or even complex projection surface. Complex objects (or aggregation of multiple simple objects) create self shadows that must be compensated by using several projectors. The objects are typically replaced by neutral color ones, the projection giving all its visual properties, thus the name shader lamps. The technique can be used to create a sense of invisibility, by rendering transparency. The object is illuminated not by a replacement of its own visual properties, but by the corresponding visual surface placed behind the object as seen from an arbitrary viewing point.

    Read more →
  • Night Sky (app)

    Night Sky (app)

    Night Sky (app) is an application developed and published by indie studio iCandi Apps Ltd. from the UK. Night Sky is a stargazing reference app, where the user can explore a virtual representation of the night sky to identify stars, planets, constellations and satellites. The app is developed specifically for iOS, tvOS and watchOS devices. Night Sky was first released on November 1, 2011 for iOS, and has had multiple updates since launch. Night Sky was mentioned in the September 2016 Apple Keynote during the Apple Watch Series 2 announcement. In October 2016, Night Sky was featured as the Free App of The Week on the Apple App Store. == Reception == Night Sky was featured in Apple's 'Best of 2012' and has also been pre-installed onto iPads in Apple retail stores worldwide.

    Read more →
  • Deep learning in photoacoustic imaging

    Deep learning in photoacoustic imaging

    Photoacoustic imaging (PA) is based on the photoacoustic effect, in which optical absorption causes a rise in temperature, which causes a subsequent rise in pressure via thermo-elastic expansion. This pressure rise propagates through the tissue and is sensed via ultrasonic transducers. Due to the proportionality between the optical absorption, the rise in temperature, and the rise in pressure, the ultrasound pressure wave signal can be used to quantify the original optical energy deposition within the tissue. Photoacoustic imaging has applications of deep learning in both photoacoustic computed tomography (PACT) and photoacoustic microscopy (PAM). PACT utilizes wide-field optical excitation and an array of unfocused ultrasound transducers. Similar to other computed tomography methods, the sample is imaged at multiple view angles, which are then used to perform an inverse reconstruction algorithm based on the detection geometry (typically through universal backprojection, modified delay-and-sum, or time reversal ) to elicit the initial pressure distribution within the tissue. PAM on the other hand uses focused ultrasound detection combined with weakly focused optical excitation (acoustic resolution PAM or AR-PAM) or tightly focused optical excitation (optical resolution PAM or OR-PAM). PAM typically captures images point-by-point via a mechanical raster scanning pattern. At each scanned point, the acoustic time-of-flight provides axial resolution while the acoustic focusing yields lateral resolution. == Applications of deep learning in PACT == The first application of deep learning in PACT was by Reiter et al. in which a deep neural network was trained to learn spatial impulse responses and locate photoacoustic point sources. The resulting mean axial and lateral point location errors on 2,412 of their randomly selected test images were 0.28 mm and 0.37 mm respectively. After this initial implementation, the applications of deep learning in PACT have branched out primarily into removing artifacts from acoustic reflections, sparse sampling, limited-view, and limited-bandwidth. There has also been some recent work in PACT toward using deep learning for wavefront localization. There have been networks based on fusion of information from two different reconstructions to improve the reconstruction using deep learning fusion based networks. === Using deep learning to locate photoacoustic point sources === Traditional photoacoustic beamforming techniques modeled photoacoustic wave propagation by using detector array geometry and the time-of-flight to account for differences in the PA signal arrival time. However, this technique failed to account for reverberant acoustic signals caused by acoustic reflection, resulting in acoustic reflection artifacts that corrupt the true photoacoustic point source location information. In Reiter et al., a convolutional neural network (similar to a simple VGG-16 style architecture) was used that took pre-beamformed photoacoustic data as input and outputted a classification result specifying the 2-D point source location. ==== Deep learning for PA wavefront localization ==== Johnstonbaugh et al. was able to localize the source of photoacoustic wavefronts with a deep neural network. The network used was an encoder-decoder style convolutional neural network. The encoder-decoder network was made of residual convolution, upsampling, and high field-of-view convolution modules. A Nyquist convolution layer and differentiable spatial-to-numerical transform layer were also used within the architecture. Simulated PA wavefronts served as the input for training the model. To create the wavefronts, the forward simulation of light propagation was done with the NIRFast toolbox and the light-diffusion approximation, while the forward simulation of sound propagation was done with the K-Wave toolbox. The simulated wavefronts were subjected to different scattering mediums and Gaussian noise. The output for the network was an artifact free heat map of the targets axial and lateral position. The network had a mean error rate of less than 30 microns when localizing target below 40 mm and had a mean error rate of 1.06 mm for localizing targets between 40 mm and 60 mm. With a slight modification to the network, the model was able to accommodate multi target localization. A validation experiment was performed in which pencil lead was submerged into an intralipid solution at a depth of 32 mm. The network was able to localize the lead's position when the solution had a reduced scattering coefficient of 0, 5, 10, and 15 cm−1. The results of the network show improvements over standard delay-and-sum or frequency-domain beamforming algorithms and Johnstonbaugh proposes that this technology could be used for optical wavefront shaping, circulating melanoma cell detection, and real-time vascular surgeries. === Removing acoustic reflection artifacts (in the presence of multiple sources and channel noise) === Building on the work of Reiter et al., Allman et al. utilized a full VGG-16 architecture to locate point sources and remove reflection artifacts within raw photoacoustic channel data (in the presence of multiple sources and channel noise). This utilization of deep learning trained on simulated data produced in the MATLAB k-wave library, and then later reaffirmed their results on experimental data. === Ill-posed PACT reconstruction === In PACT, tomographic reconstruction is performed, in which the projections from multiple solid angles are combined to form an image. When reconstruction methods like filtered backprojection or time reversal, are ill-posed inverse problems due to sampling under the Nyquist-Shannon's sampling requirement or with limited-bandwidth/view, the resulting reconstruction contains image artifacts. Traditionally these artifacts were removed with slow iterative methods like total variation minimization, but the advent of deep learning approaches has opened a new avenue that utilizes a priori knowledge from network training to remove artifacts. In the deep learning methods that seek to remove these sparse sampling, limited-bandwidth, and limited-view artifacts, the typical workflow involves first performing the ill-posed reconstruction technique to transform the pre-beamformed data into a 2-D representation of the initial pressure distribution that contains artifacts. Then, a convolutional neural network (CNN) is trained to remove the artifacts, in order to produce an artifact-free representation of the ground truth initial pressure distribution. ==== Using deep learning to remove sparse sampling artifacts ==== When the density of uniform tomographic view angles is under what is prescribed by the Nyquist-Shannon's sampling theorem, it is said that the imaging system is performing sparse sampling. Sparse sampling typically occurs as a way of keeping production costs low and improving image acquisition speed. The typical network architectures used to remove these sparse sampling artifacts are U-net and Fully Dense (FD) U-net. Both of these architectures contain a compression and decompression phase. The compression phase learns to compress the image to a latent representation that lacks the imaging artifacts and other details. The decompression phase then combines with information passed by the residual connections in order to add back image details without adding in the details associated with the artifacts. FD U-net modifies the original U-net architecture by including dense blocks that allow layers to utilize information learned by previous layers within the dense block. Another technique was proposed using a simple CNN based architecture for removal of artifacts and improving the k-wave image reconstruction. ==== Removing limited-view artifacts with deep learning ==== When a region of partial solid angles are not captured, generally due to geometric limitations, the image acquisition is said to have limited-view. As illustrated by the experiments of Davoudi et al., limited-view corruptions can be directly observed as missing information in the frequency domain of the reconstructed image. Limited-view, similar to sparse sampling, makes the initial reconstruction algorithm ill-posed. Prior to deep learning, the limited-view problem was addressed with complex hardware such as acoustic deflectors and full ring-shaped transducer arrays, as well as solutions like compressed sensing, weighted factor, and iterative filtered backprojection. The result of this ill-posed reconstruction is imaging artifacts that can be removed by CNNs. The deep learning algorithms used to remove limited-view artifacts include U-net and FD U-net, as well as generative adversarial networks (GANs) and volumetric versions of U-net. One GAN implementation of note improved upon U-net by using U-net as a generator and VGG as a discriminator, with the Wasserstein metric and gradient penalty to stabilize training (WGAN-GP). ==== Pixel-wise interpolation

    Read more →