AI Chat Prompt Generator

AI Chat Prompt Generator — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Neural style transfer

    Neural style transfer

    Neural style transfer (NST) software algorithms are able to manipulate digital images, or videos, in order to adopt the appearance or visual style of another image. NST algorithms are characterized by their use of deep neural networks for the sake of image transformation. Common uses for NST are the creation of artificial artwork from photographs, for example by transferring the appearance of famous paintings to user-supplied photographs. Several notable mobile apps use NST techniques for this purpose, including DeepArt and Prisma. This method has been used by artists and designers around the globe to develop new artwork based on existent style(s). == History == NST is an example of image stylization, a problem studied for over two decades within the field of non-photorealistic rendering. The first two example-based style transfer algorithms were image analogies and image quilting. Both of these methods were based on patch-based texture synthesis algorithms. Given a training pair of images–a photo and an artwork depicting that photo–a transformation could be learned and then applied to create new artwork from a new photo, by analogy. If no training photo was available, it would need to be produced by processing the input artwork; image quilting did not require this processing step, though it was demonstrated on only one style. NST was first published in the paper "A Neural Algorithm of Artistic Style" by Leon Gatys et al., originally released to ArXiv 2015, and subsequently accepted by the peer-reviewed CVPR conference in 2016. The original paper used a VGG-19 architecture that has been pre-trained to perform object recognition using the ImageNet dataset. In 2017, Google AI introduced a method that allows a single deep convolutional style transfer network to learn multiple styles at the same time. This algorithm permits style interpolation in real-time, even when done on video media. == Mathematics == This section closely follows the original paper. === Overview === The idea of Neural Style Transfer (NST) is to take two images—a content image p → {\displaystyle {\vec {p}}} and a style image a → {\displaystyle {\vec {a}}} —and generate a third image x → {\displaystyle {\vec {x}}} that minimizes a weighted combination of two loss functions: a content loss L content ( p → , x → ) {\displaystyle {\mathcal {L}}_{\text{content }}({\vec {p}},{\vec {x}})} and a style loss L style ( a → , x → ) {\displaystyle {\mathcal {L}}_{\text{style }}({\vec {a}},{\vec {x}})} . The total loss is a linear sum of the two: L NST ( p → , a → , x → ) = α L content ( p → , x → ) + β L style ( a → , x → ) {\displaystyle {\mathcal {L}}_{\text{NST}}({\vec {p}},{\vec {a}},{\vec {x}})=\alpha {\mathcal {L}}_{\text{content}}({\vec {p}},{\vec {x}})+\beta {\mathcal {L}}_{\text{style}}({\vec {a}},{\vec {x}})} By jointly minimizing the content and style losses, NST generates an image that blends the content of the content image with the style of the style image. Both the content loss and the style loss measures the similarity of two images. The content similarity is the weighted sum of squared-differences between the neural activations of a single convolutional neural network (CNN) on two images. The style similarity is the weighted sum of Gram matrices within each layer (see below for details). The original paper used a VGG-19 CNN, but the method works for any CNN. === Symbols === Let x → {\textstyle {\vec {x}}} be an image input to a CNN. Let F l ∈ R N l × M l {\textstyle F^{l}\in \mathbb {R} ^{N_{l}\times M_{l}}} be the matrix of filter responses in layer l {\textstyle l} to the image x → {\textstyle {\vec {x}}} , where: N l {\textstyle N_{l}} is the number of filters in layer l {\textstyle l} ; M l {\textstyle M_{l}} is the height times the width (i.e. number of pixels) of each filter in layer l {\textstyle l} ; F i j l ( x → ) {\textstyle F_{ij}^{l}({\vec {x}})} is the activation of the i th {\textstyle i^{\text{th}}} filter at position j {\textstyle j} in layer l {\textstyle l} . A given input image x → {\textstyle {\vec {x}}} is encoded in each layer of the CNN by the filter responses to that image, with higher layers encoding more global features, but losing details on local features. === Content loss === Let p → {\textstyle {\vec {p}}} be an original image. Let x → {\textstyle {\vec {x}}} be an image that is generated to match the content of p → {\textstyle {\vec {p}}} . Let P l {\textstyle P^{l}} be the matrix of filter responses in layer l {\textstyle l} to the image p → {\textstyle {\vec {p}}} . The content loss is defined as the squared-error loss between the feature representations of the generated image and the content image at a chosen layer l {\displaystyle l} of a CNN: L content ( p → , x → , l ) = 1 2 ∑ i , j ( A i j l ( x → ) − A i j l ( p → ) ) 2 {\displaystyle {\mathcal {L}}_{\text{content }}({\vec {p}},{\vec {x}},l)={\frac {1}{2}}\sum _{i,j}\left(A_{ij}^{l}({\vec {x}})-A_{ij}^{l}({\vec {p}})\right)^{2}} where A i j l ( x → ) {\displaystyle A_{ij}^{l}({\vec {x}})} and A i j l ( p → ) {\displaystyle A_{ij}^{l}({\vec {p}})} are the activations of the i th {\displaystyle i^{\text{th}}} filter at position j {\displaystyle j} in layer l {\displaystyle l} for the generated and content images, respectively. Minimizing this loss encourages the generated image to have similar content to the content image, as captured by the feature activations in the chosen layer. The total content loss is a linear sum of the content losses of each layer: L content ( p → , x → ) = ∑ l v l L content ( p → , x → , l ) {\displaystyle {\mathcal {L}}_{\text{content }}({\vec {p}},{\vec {x}})=\sum _{l}v_{l}{\mathcal {L}}_{\text{content }}({\vec {p}},{\vec {x}},l)} , where the v l {\displaystyle v_{l}} are positive real numbers chosen as hyperparameters. === Style loss === The style loss is based on the Gram matrices of the generated and style images, which capture the correlations between different filter responses at different layers of the CNN: L style ( a → , x → ) = ∑ l = 0 L w l E l , {\displaystyle {\mathcal {L}}_{\text{style }}({\vec {a}},{\vec {x}})=\sum _{l=0}^{L}w_{l}E_{l},} where E l = 1 4 N l 2 M l 2 ∑ i , j ( G i j l ( x → ) − G i j l ( a → ) ) 2 . {\displaystyle E_{l}={\frac {1}{4N_{l}^{2}M_{l}^{2}}}\sum _{i,j}\left(G_{ij}^{l}({\vec {x}})-G_{ij}^{l}({\vec {a}})\right)^{2}.} Here, G i j l ( x → ) {\displaystyle G_{ij}^{l}({\vec {x}})} and G i j l ( a → ) {\displaystyle G_{ij}^{l}({\vec {a}})} are the entries of the Gram matrices for the generated and style images at layer l {\displaystyle l} . Explicitly, G i j l ( x → ) = ∑ k F i k l ( x → ) F j k l ( x → ) {\displaystyle G_{ij}^{l}({\vec {x}})=\sum _{k}F_{ik}^{l}({\vec {x}})F_{jk}^{l}({\vec {x}})} Minimizing this loss encourages the generated image to have similar style characteristics to the style image, as captured by the correlations between feature responses in each layer. The idea is that activation pattern correlations between filters in a single layer captures the "style" on the order of the receptive fields at that layer. Similarly to the previous case, the w l {\displaystyle w_{l}} are positive real numbers chosen as hyperparameters. === Hyperparameters === In the original paper, they used a particular choice of hyperparameters. The style loss is computed by w l = 0.2 {\displaystyle w_{l}=0.2} for the outputs of layers conv1_1, conv2_1, conv3_1, conv4_1, conv5_1 in the VGG-19 network, and zero otherwise. The content loss is computed by w l = 1 {\displaystyle w_{l}=1} for conv4_2, and zero otherwise. The ratio α / β ∈ [ 5 , 50 ] × 10 − 4 {\displaystyle \alpha /\beta \in [5,50]\times 10^{-4}} . === Training === Image x → {\displaystyle {\vec {x}}} is initially approximated by adding a small amount of white noise to input image p → {\displaystyle {\vec {p}}} and feeding it through the CNN. Then we successively backpropagate this loss through the network with the CNN weights fixed in order to update the pixels of x → {\displaystyle {\vec {x}}} . After several thousand epochs of training, an x → {\displaystyle {\vec {x}}} (hopefully) emerges that matches the style of a → {\displaystyle {\vec {a}}} and the content of p → {\displaystyle {\vec {p}}} . As of 2017, when implemented on a GPU, it takes a few minutes to converge. == Extensions == In some practical implementations, it is noted that the resulting image has too much high-frequency artifact, which can be suppressed by adding the total variation to the total loss. Compared to VGGNet, AlexNet does not work well for neural style transfer. NST has also been extended to videos. Subsequent work improved the speed of NST for images by using special-purpose normalizations. In a paper by Fei-Fei Li et al. adopted a different regularized loss metric and accelerated method for training to produce results in real-time (three orders of magnitude faster than Gatys). Their idea was to use not the pixel-based loss defined above but rather a 'perceptual loss' measuring t

    Read more →
  • Visual Turing Test

    Visual Turing Test

    The Visual Turing Test is “an operator-assisted device that produces a stochastic sequence of binary questions from a given test image”. The query engine produces a sequence of questions that have unpredictable answers given the history of questions. The test is only about vision and does not require any natural language processing. The job of the human operator is to provide the correct answer to the question or reject it as ambiguous. The query generator produces questions such that they follow a “natural story line”, similar to what humans do when they look at a picture. == History == Research in computer vision dates back to the 1960s when Seymour Papert first attempted to solve the problem. This unsuccessful attempt was referred to as the Summer Vision Project. The reason why it was not successful was because computer vision is more complicated than what people think. The complexity is in alignment with the human visual system. Roughly 50% of the human brain is devoted in processing vision, which indicates that it is a difficult problem. Later there were attempts to solve the problems with models inspired by the human brain. Perceptrons by Frank Rosenblatt, which is a form of the neural networks, was one of the first such approaches. These simple neural networks could not live up to their expectations and had certain limitations due to which they were not considered in future research. Later with the availability of the hardware and some processing power the research shifted to image processing which involves pixel-level operations, like finding edges, de-noising images or applying filters to name a few. There was some great progress in this field but the problem of vision which was to make the machines understand the images was still not being addressed. During this time the neural networks also resurfaced as it was shown that the limitations of the perceptrons can be overcome by Multi-layer perceptrons. Also in the early 1990s convolutional neural networks were born which showed great results on digit recognition but did not scale up well on harder problems. The late 1990s and early 2000s saw the birth of modern computer vision. One of the reasons this happened was due to the availability of key, feature extraction and representation algorithms. Features along with the already present machine learning algorithms were used to detect, localise and segment objects in Images. While all these advancements were being made, the community felt the need to have standardised datasets and evaluation metrics so the performances can be compared. This led to the emergence of challenges like the Pascal VOC challenge and the ImageNet challenge. The availability of standard evaluation metrics and the open challenges gave directions to the research. Better algorithms were introduced for specific tasks like object detection and classification. Visual Turing Test aims to give a new direction to the computer vision research which would lead to the introduction of systems that will be one step closer to understanding images the way humans do. == Current evaluation practices == A large number of datasets have been annotated and generalised to benchmark performances of difference classes of algorithms to assess different vision tasks (e.g., object detection/recognition) on some image domain (e.g., scene images). One of the most famous datasets in computer vision is ImageNet which is used to assess the problem of object level Image classification. ImageNet is one of the largest annotated datasets available and has over one million images. The other important vision task is object detection and localisation which refers to detecting the object instance in the image and providing the bounding box coordinates around the object instance or segmenting the object. The most popular dataset for this task is the Pascal dataset. Similarly there are other datasets for specific tasks like the H3D dataset for human pose detection, Core dataset to evaluate the quality of detected object attributes such as colour, orientation, and activity. Having these standard datasets has helped the vision community to come up with well performing algorithms for all these tasks. The next logical step is to create a larger task encompassing of these smaller subtasks. Having such a task would lead to building systems that would understand images, as understanding images would inherently involve detecting objects, localising them and segmenting them. == Details == The Visual Turing Test (VTT) unlike the Turing test has a query engine system which interrogates a computer vision system in the presence of a human co-ordinator. It is a system that generates a random sequence of binary questions specific to the test image, such that the answer to any question k is unpredictable given the true answers to the previous k − 1 questions (also known as history of questions). The test happens in the presence of a human operator who serves two main purposes: removing the ambiguous questions and providing the correct answers to the unambiguous questions. Given an Image infinite possible binary questions can be asked and a lot of them are bound to be ambiguous. These questions if generated by the query engine are removed by the human moderator and instead the query engine generates another question such that the answer to it is unpredictable given the history of the questions. The aim of the Visual Turing Test is to evaluate the Image understanding of a computer system, and an important part of image understanding is the story line of the image. When humans look at an image, they do not think that there is a car at ‘x’ pixels from the left and ‘y’ pixels from the top, but instead they look at it as a story, for e.g. they might think that there is a car parked on the road, a person is exiting the car and heading towards a building. The most important elements of the story line are the objects and so to extract any story line from an image the first and the most important task is to instantiate the objects in it, and that is what the query engine does. === Query engine === The query engine is the core of the Visual Turing Test and it comprises two main parts : Vocabulary and Questions ==== Vocabulary ==== Vocabulary is a set of words that represent the elements of the images. This vocabulary when used with appropriate grammar leads to a set of questions. The grammar is defined in the next section in a way that it leads to a space of binary questions. The vocabulary V {\displaystyle {\mathcal {V}}} consist of three components: Types of Objects T {\displaystyle {\mathcal {T}}} Type-dependent attributes of objects A ( t ) {\displaystyle {\mathcal {A}}(t)} Type-dependent relationships between two objects R ( t , t ′ ) {\displaystyle {\mathcal {R}}(t,t')} For Images of urban street scenes the types of objects include people, vehicle and buildings. Attributes refer to the properties of these objects, for e.g. female, child, wearing a hat or carrying something, for people and moving, parked, stopped, one tire visible or two tires visible for vehicles. Relationships between each pair of object classes can be either “ordered” or “unordered”. The unordered relationships may include talking, walking together and the ordered relationships include taller, closer to the camera, occluding, being occluded etc. Additionally all of this vocabulary is used in context of rectangular image regions w \in W which allow for the localisation of objects in the image. An extremely large number of such regions are possible and this complicates the problem, so for this test, regions at specific scales are only used which include 1/16 the size of image, 1/4 the size of image, 1/2 the size of image or larger. ==== Questions ==== The question space is composed of four types of questions: Existence questions: The aim of the existence questions is to find new objects in the image that have not been uniquely identified previously. They are of the form : Qexist = 'Is there an instance of an object of type t with attributes A partially visible in region w that was not previously instantiated?' Uniqueness questions: A uniqueness question tries to uniquely identify an object to instantiate it. Quniq = 'Is there a unique instance of an object of type t with attributes A partially visible in region w that was not previously instantiated?' The uniqueness questions along with the existence questions form the instantiation questions. As mentioned earlier instantiating objects leads to other interesting questions and eventually a story line. Uniqueness questions follow the existence questions and a positive answer to it leads to instantiation of an object. Attribute questions: An attribute question tries to find more about the object once it has been instantiated. Such questions can query about a single attribute, conjunction of two attributes or disjunction of two attributes. Qatt(ot) = {'Does object ot have attribute a?' , 'Does object

    Read more →
  • Pill reminder

    Pill reminder

    A pill reminder is any device that reminds users to take medications. Traditional pill reminders are pill containers with electric timers attached, which can be preset for certain times of the day to set off an alarm. More sophisticated pill reminders can also detect when they have been opened, and therefore when the user is away during the time they were supposed to take their medication, they will be reminded of it when they return. This reminder can be in the form of a light, which also helps for deaf or hearing-impaired users. == Mobile app == A newer type of pill reminder is a mobile app that reminds the owner to take the medication. Some of these applications might effectively support adherence to taking medications.

    Read more →
  • Digital image processing

    Digital image processing

    Digital image processing is the use of a digital computer to process digital images through an algorithm. As a subcategory or field of digital signal processing, digital image processing has many advantages over analog image processing. It allows a much wider range of algorithms to be applied to the input data and can avoid problems such as the build-up of noise and distortion during processing. Since images are defined over two dimensions (perhaps more), digital image processing may be modeled in the form of multidimensional systems. The generation and development of digital image processing are mainly affected by three factors: first, the development of computers; second, the development of mathematics (especially the creation and improvement of discrete mathematics theory); and third, the demand for a wide range of applications in environment, agriculture, military, industry and medical science has increased. == History == Many of the techniques of digital image processing, or digital picture processing as it often was called, were developed in the 1960s, at Bell Laboratories, the Jet Propulsion Laboratory, Massachusetts Institute of Technology, University of Maryland, and a few other research facilities, with application to satellite imagery, wire-photo standards conversion, medical imaging, videophone, character recognition, and photograph enhancement. The purpose of early image processing was to improve the quality of the image. In image processing, the input is a low-quality image, and the output is an image with improved quality. Common image processing includes image enhancement, restoration, encoding, and compression. The first successful application was the American Jet Propulsion Laboratory (JPL). They used image processing techniques such as geometric correction, gradation transformation, noise removal, etc. on the thousands of lunar photos sent back by the Space Detector Ranger 7 in 1964, taking into account the position of the Sun and the environment of the Moon. The impact of the successful mapping of the Moon's surface map by the computer has been a success. Later, more complex image processing was performed on the nearly 100,000 photos sent back by the spacecraft, so that the topographic map, color map and panoramic mosaic of the Moon were obtained, which achieved extraordinary results and laid a solid foundation for human landing on the Moon. The cost of processing was fairly high, however, with the computing equipment of that era. That changed in the 1970s, when digital image processing proliferated as cheaper computers and dedicated hardware became available. This led to images being processed in real-time, for some dedicated problems such as television standards conversion. As general-purpose computers became faster, they started to take over the role of dedicated hardware for all but the most specialized and computer-intensive operations. With the fast computers and signal processors available in the 2000s, digital image processing has become the most common form of image processing, and is generally used because it is not only the most versatile method, but also the cheapest. === Image sensors === The basis for modern image sensors is metal–oxide–semiconductor (MOS) technology, invented at Bell Labs between 1955 and 1960, This led to the development of digital semiconductor image sensors, including the charge-coupled device (CCD) and later the CMOS sensor. The charge-coupled device was invented by Willard S. Boyle and George E. Smith at Bell Labs in 1969. While researching MOS technology, they realized that an electric charge was the analogy of the magnetic bubble and that it could be stored on a tiny MOS capacitor. As it was fairly straightforward to fabricate a series of MOS capacitors in a row, they connected a suitable voltage to them so that the charge could be stepped along from one to the next. The CCD is a semiconductor circuit that was later used in the first digital video cameras for television broadcasting. The NMOS active-pixel sensor (APS) was invented by Olympus in Japan during the mid-1980s. This was enabled by advances in MOS semiconductor device fabrication, with MOSFET scaling reaching smaller micron and then sub-micron levels. The NMOS APS was fabricated by Tsutomu Nakamura's team at Olympus in 1985. The CMOS active-pixel sensor (CMOS sensor) was later developed by Eric Fossum's team at the NASA Jet Propulsion Laboratory in 1993. By 2007, sales of CMOS sensors had surpassed CCD sensors. MOS image sensors are widely used in optical mouse technology. The first optical mouse, invented by Richard F. Lyon at Xerox in 1980, used a 5 μm NMOS integrated circuit sensor chip. Since the first commercial optical mouse, the IntelliMouse introduced in 1999, most optical mouse devices use CMOS sensors. === Image compression === An important development in digital image compression technology was the discrete cosine transform (DCT), a lossy compression technique first proposed by Nasir Ahmed in 1972. DCT compression became the basis for JPEG, which was introduced by the Joint Photographic Experts Group in 1992. JPEG compresses images down to much smaller file sizes, and has become the most widely used image file format on the Internet. Its highly efficient DCT compression algorithm was largely responsible for the wide proliferation of digital images and digital photos, with several billion JPEG images produced every day as of 2015. Medical imaging techniques produce very large amounts of data, especially from CT, MRI and PET modalities. As a result, storage and communications of electronic image data are prohibitive without the use of compression. JPEG 2000 image compression is used by the DICOM standard for storage and transmission of medical images. The cost and feasibility of accessing large image data sets over low or various bandwidths are further addressed by use of another DICOM standard, called JPIP, to enable efficient streaming of the JPEG 2000 compressed image data. === Digital signal processor (DSP) === Electronic signal processing was revolutionized by the wide adoption of MOS technology in the 1970s. MOS integrated circuit technology was the basis for the first single-chip microprocessors and microcontrollers in the early 1970s, and then the first single-chip digital signal processor (DSP) chips in the late 1970s. DSP chips have since been widely used in digital image processing. The discrete cosine transform (DCT) image compression algorithm has been widely implemented in DSP chips, with many companies developing DSP chips based on DCT technology. DCTs are widely used for encoding, decoding, video coding, audio coding, multiplexing, control signals, signaling, analog-to-digital conversion, formatting luminance and color differences, and color formats such as YUV444 and YUV411. DCTs are also used for encoding operations such as motion estimation, motion compensation, inter-frame prediction, quantization, perceptual weighting, entropy encoding, variable encoding, and motion vectors, and decoding operations such as the inverse operation between different color formats (YIQ, YUV and RGB) for display purposes. DCTs are also commonly used for high-definition television (HDTV) encoder/decoder chips. == Tasks == Digital image processing allows the use of much more complex algorithms, and hence, can offer both more sophisticated performance at simple tasks, and the implementation of methods which would be impossible by analogue means. In particular, digital image processing is a concrete application of, and a practical technology based on: Classification Feature extraction Multi-scale signal analysis Pattern recognition Projection Some techniques that are used in digital image processing include: Anisotropic diffusion Hidden Markov models Image editing Image restoration Independent component analysis Linear filtering Neural networks Partial differential equations Pixelation Point feature matching Principal components analysis Self-organizing maps Wavelets == Digital image transformations == === Filtering === Digital filters are used to blur and sharpen digital images. Filtering can be performed by: convolution with specifically designed kernels (filter array) in the spatial domain masking specific frequency regions in the frequency (Fourier) domain The following examples show both methods: ==== Image padding in Fourier domain filtering ==== Images are typically padded before being transformed to the Fourier space, the highpass filtered images below illustrate the consequences of different padding techniques: Notice that the highpass filter shows extra edges when zero padded compared to the repeated edge padding. ==== Filtering code examples ==== MATLAB example for spatial domain highpass filtering. === Affine transformations === Affine transformations enable basic image transformations including scale, rotate, translate, mirror and shear as is shown in the following examples: To apply the affine

    Read more →
  • EXAPT

    EXAPT

    EXAPT (a portmanteau of "Extended Subset of APT") is a production-oriented programming language that allows users to generate NC programs with control information for machining tools and facilitates decision-making for production-related issues that may arise during various machining processes. EXAPT was first developed to address industrial requirements. Through the years, the company created additional software for the manufacturing industry. Today, EXAPT offers a suite of SAAS products and services for the manufacturing industry. The trade name, EXAPT, is most commonly associated with the CAD/CAM-System, production data, and tool management software of the German company EXAPT Systemtechnik GmbH based in Aachen, DE. == General == EXAPT is a modularly built programming system for all NC machining operations as Drilling Turning Milling Turn-Milling Nibbling Flame-, laser-, plasma- and water jet cutting Wire eroding Operations with industrial robots Due to the modular structure, the main product groups, EXAPTcam and EXAPTpdo, are gradually expandable and permit individual software for the manufacturing industry used individually and also in a compound with an existing IT environment. == Functionality == EXAPTcam meets the requirements for NC planning, especially for the cutting operations such as turning, drilling, and milling up to 5-axis simultaneous machining. Thereby new process technologies, tool, and machine concepts are constantly involved. In the NC programming data from different sources such as 3D CAD models, drawings or tables can flow in. The possibilities of NC programming reaches from language-oriented to feature-oriented NC programming. The integrated EXAPT knowledge database and intelligent and scalable automatisms support the user. The EXAPT NC planning also covers the generation of production information as clamping and tool plans, presetting data or time calculations. The realistic simulation possibilities of NC planning and NC control data provide with production reliability. EXAPTpdo (EXAPT ProductionsDataOrganization) provides a neutrally applicable technology platform for the information compound of the NC planning - to the shop floor. This applies to all NC production data that are necessary for the set-up of NC machines, for the provision, presetting, and stocking of manufacturing resources and provided by EXAPTpdo in a central database. Besides classical functions of the tool management system (TMS) as the management of cutting tools, measuring, testing and clamping devices the technology data management and tool lifecycle management (TLM) is also included. System-supported "where-used lists" helps to handle the manufacturing resource cycle by secured requirement determination and requirement fulfillment. Unnecessary transports and unplanned dispositive adjustments are dropped, stocks are reduced, set-up times reduced and the throughput is increased. EXAPTpdo synchronizes involved systems within the value chain. Stock systems, MES systems or ERP systems (e.g. from the purchasing or production areas) do not work in isolation from each other but they interact with each other. EXAPTpdo provides the base to Smart Factory, for more flexibility in production and faster communication. == History == With the foundation of the EXAPT-Verein in 1967 as spin-off of the universities Aachen, Berlin and Stuttgart the further development "EXAPT (EXtended Subset of APT)" of the programming language "APT (Automatically Programmed Tool)" was focused and so the first milestone for the EXAPT history was set. In the same year the system EXAPT 1 for drilling and simple milling tasks became available. 1969 The industrial application of EXAPT 2 for the programming of NC machines with 2-axis linear and path control begins. In the following year, the development of the EXAPT modular system starts. 1972 BASIC-EXAPT is provided for the universal, homogeneous programming of all NC tasks. The support is made by the EXAPT applications consultancy. 1973 EXAPT 1.1 is provided for the programming of straight-cut and continuous-path controlled drilling and milling machines and machining centers. At the Hanover Fair (IHA 73) the interactive access to a mainframe via a time-sharing terminal for the part program entry and correction is presented and starts the replacement of the punch card. 1974 The possibilities for the use of process computers for the NC data transfer are leveled out. EXAPT offers the possibility of the result simulation when using plotters with display of tool paths and tools in assignment to the workpiece. In April 1975, the EXAPT NC Systemtechnik GmbH was founded with the aim, of enabling entry into the NC technique for small and medium-sized companies by a complete product and service program. In the following year, the system portfolio is extended with further system modules and service programs and the provision of postprocessors. 1978 The development activities on the EXAPT module system started in 1970 are completed. Using modern software techniques, the different system parts BASIC-EXAPT, EXAPT 1, EXAPT 1.1, and EXAPT 2 are composed of a total system. System support and applications consultancy become a new working focus. From the beginning to the middle of the 1980s Beside new portable software modules for CAD/CAM applications (e. g. CAPEX, NESTEX, CADEX, CADCPL), the first version of the EXAPT DNC system and extensions of the EXAPT NC programming system for the machining of sculptured surfaces are presented. 1988 EXAPT expands the software product range by systems for tool data management (BMO) and production data management (FDO). EXAPT trains more than 1,300 course participants including company-specific courses. 1992 The first version of the completely new product generation EXAPTplus is presented and the agency in Dresden is opened. 1993 The company name "EXAPT NC Systemtechnik GmbH" is changed to "EXAPT Systemtechnik GmbH." EXAPTplus is presented on PC under Windows NT at the EMO '93. The decentralization of the use of EXAPT systems expands the range of applications. In the following year, EXAPT-DNC is executable under Windows on a customary PC. Special hardware is not needed and so it can be used in compound with the database-supported EXAPT production data management system (FDO). 1995 EXAPTplus is also ready for complex application cases such as machining of tubes at extrusion tools. EXAPT-CADI provides the transfer of 2D CAD data to EXAPTplus. With the new office Gießen the marketing is strengthened. In the following year the EXAPT NC editor is developed for the direct processing of NC control data with tool path display and visualization of the tools. In the course of the market entry of more comfortable 3D CAD systems for the solid modelling of components a detailed evaluation of current systems is made in 1997. It is decided to use SolidWorks as a reference system for the solid-oriented NC planning with EXAPT. 1998 The first solution for the transfer of geometry data between SolidWorks and EXAPTplus is generated. The EXAPT organization systems are (beside SQL) also executable under Oracle now. The use of client server solutions supports the data flow in the production. 1999 AFR functions are provided in connection with EXAPTsolid to support a workpiece modelling for NC. The millennium capability is ensured for all EXAPT systems. AFR is a ground-breaking for the integration of third-party products. 2002 EXAPT-BMG is developed for the generation and visualization of tools with additional functions for the assembly from components. The acquisition of tools with their geometric and technological presentation offers extensive support of the NC planning with EXAPT systems. 2003 EXAPTpdo is available to optimize the process chains in production planning and production execution optimally regarding the increasing requirements of changing production conditions. 2004 Diverse system extensions are made in EXAPTplus, EXAPTsolid, EXAPT NC editor, EXAPTpdo for the complete machining on turning/milling centres with result reliability because of more extensive simulation based on realNC (Tecnomatix), for the use of new complex tool systems and the compound use between ERP systems as SAP and intelligent CNC systems. In the following year, EXAPTpdo is extended for the cross-order set-up optimization and provision of manufacturing re-sources especially for single and small series production with connection to purchase and physical portfolio management. 2006 The EXAPT systems are available for extended use as an information platform for production, the time management, and similar requirements. EXAPTsolid is extended for the feature-oriented milling operation and machine simulation. The NC programming of complex machine tools, e.g. three-turret-turning/milling centers is supported by EXAPT systems, as well as the use of multi-functional tools. 2007 A module for 3-5-axis simultaneous milling machining is presented.

    Read more →
  • Language model benchmark

    Language model benchmark

    A language model benchmark is a standardized test designed to evaluate the performance of language models on various natural language processing tasks. These tests are intended for comparing different models' capabilities in areas such as language understanding, generation, and reasoning. Benchmarks generally consist of a dataset and corresponding evaluation metrics. The dataset provides text samples and annotations, while the metrics measure a model's performance on tasks like answering questions, text classification, and machine translation. These benchmarks are developed and maintained by academic institutions, research organizations, and industry players to track progress in the field. In addition to accuracy, the metrics can include throughput, energy efficiency, bias, trust, and sustainability. == Overview == === Types === Benchmarks may be described by the following adjectives, not mutually exclusive: Classical: These tasks are studied in natural language processing, even before the advent of deep learning. Examples include the Penn Treebank for testing syntactic and semantic parsing, as well as bilingual translation benchmarked by BLEU scores. Question answering: These tasks have a text question and a text answer, often multiple-choice. They can be open-book or closed-book. Open-book QA resembles reading comprehension questions, with relevant passages included as annotation in the question, in which the answer appears. Closed-book QA includes no relevant passages. Closed-book QA is also called open-domain question-answering. Before the era of large language models, open-book QA was more common, and understood as testing information retrieval methods. Closed-book QA became common since GPT-2 as a method to measure knowledge stored within model parameters. Omnibus: An omnibus benchmark combines many benchmarks, often previously published. It is intended as an all-in-one benchmarking solution. Reasoning: These tasks are usually in the question-answering format, but are intended to be more difficult than standard question answering. Multimodal: These tasks require processing not only text, but also other modalities, such as images and sound. Examples include OCR and transcription. Agency: These tasks are for a language-model–based software agent that operates a computer for a user, such as editing images, browsing the web, etc. Adversarial: A benchmark is "adversarial" if the items in the benchmark are picked specifically so that certain models do badly on them. Adversarial benchmarks are often constructed after state of the art (SOTA) models have saturated (achieved 100% performance) a benchmark, to renew the benchmark. A benchmark is "adversarial" only at a certain moment in time, since what is adversarial may cease to be adversarial as newer SOTA models appear. Public/Private: A benchmark might be partly or entirely private, meaning that some or all of the questions are not publicly available. The idea is that if a question is publicly available, then it might be used for training, which would be "training on the test set" and invalidate the result of the benchmark. Usually, only the guardians of the benchmark have access to the private subsets, and to score a model on such a benchmark, one must send the model weights, or provide API access, to the guardians. The boundary between a benchmark and a dataset is not sharp. Generally, a dataset contains three "splits": training, test, and validation. Both the test and validation splits are essentially benchmarks. In general, a benchmark is distinguished from a test/validation dataset in that a benchmark is typically intended to be used to measure the performance of many different models that are not trained specifically for doing well on the benchmark, while a test/validation set is intended to be used to measure the performance of models trained specifically on the corresponding training set. In other words, a benchmark may be thought of as a test/validation set without a corresponding training set. Conversely, certain benchmarks may be used as a training set, such as the English Gigaword or the One Billion Word Benchmark, which in modern language is just the negative log-likelihood loss on a pretraining set with 1 billion words. Indeed, the distinction between benchmark and dataset in language models became sharper after the rise of the pretraining paradigm, whereby a model is first trained on massive, unlabeled datasets to learn general language patterns, syntax, and knowledge (pretraining), and the base model is then adapted to specific, downstream tasks using smaller, labeled datasets (fine-tuning). === Lifecycle === Generally, the life cycle of a benchmark consists of the following steps: Inception: A benchmark is published. It can be simply given as a demonstration of the power of a new model (implicitly) that others then picked up as a benchmark, or as a benchmark that others are encouraged to use (explicitly). Growth: More papers and models use the benchmark, and the performance on the benchmark grows. Maturity, degeneration or deprecation: A benchmark may be saturated, after which researchers move on to other benchmarks. Progress on the benchmark may also be neglected as the field moves to focus on other benchmarks. Renewal: A saturated benchmark can be upgraded to make it no longer saturated, allowing further progress. === Construction === Like datasets, benchmarks are typically constructed by several methods, individually or in combination: Web scraping: Ready-made question-answer pairs may be scraped online, such as from websites that teach mathematics and programming. Conversion: Items may be constructed programmatically from scraped web content, such as by blanking out named entities from sentences, and asking the model to fill in the blank. This was used for making the CNN/Daily Mail Reading Comprehension Task. Crowd sourcing: Items may be constructed by paying people to write them, such as on Amazon Mechanical Turk. This was used for making the MCTest. === Evaluation === Generally, benchmarks are fully automated. This limits the questions that can be asked. For example, with mathematical questions, "proving a claim" would be difficult to automatically check, while "calculate an answer with a unique integer answer" would be automatically checkable. With programming tasks, the answer can generally be checked by running unit tests, with an upper limit on runtime. The benchmark scores are of the following kinds: For multiple choice or cloze questions, common scores are accuracy (frequency of correct answer), precision, recall, F1 score, etc. pass@n: The model is given n {\displaystyle n} attempts to solve each problem. If any attempt is correct, the model earns a point. The pass@n score is the model's average score over all problems. k@n: The model makes n {\displaystyle n} attempts to solve each problem, but only k {\displaystyle k} attempts out of them are selected for submission. If any submission is correct, the model earns a point. The k@n score is the model's average score over all problems. cons@n: The model is given n {\displaystyle n} attempts to solve each problem. If the most common answer is correct, the model earns a point. The cons@n score is the model's average score over all problems. Here "cons" stands for "consensus" or "majority voting". The pass@n score can be estimated more accurately by making N > n {\displaystyle N>n} attempts, and use the unbiased estimator 1 − ( N − c n ) ( N n ) {\displaystyle 1-{\frac {\binom {N-c}{n}}{\binom {N}{n}}}} , where c {\displaystyle c} is the number of correct attempts. For less well-formed tasks, where the output can be any sentence, there are the following commonly used scores including BLEU ROUGE, METEOR, NIST, word error rate, LEPOR, CIDEr, and SPICE. === Issues === error: Some benchmark answers may be wrong. ambiguity: Some benchmark questions may be ambiguously worded. subjective: Some benchmark questions may not have an objective answer at all. This problem generally prevents creative writing benchmarks. Similarly, this prevents benchmarking writing proofs in natural language, though benchmarking proofs in a formal language is possible. open-ended: Some benchmark questions may not have a single answer of a fixed size. This problem generally prevents programming benchmarks from using more natural tasks such as "write a program for X", and instead uses tasks such as "write a function that implements specification X". inter-annotator agreement: Some benchmark questions may be not fully objective, such that even people would not agree with 100% on what the answer should be. This is common in natural language processing tasks, such as syntactic annotation. shortcut: Some benchmark questions may be easily solved by an "unintended" shortcut. For example, in the SNLI benchmark, having a negative word like "not" in the second sentence is a strong signal for the "Contradiction" category, regardless of what the se

    Read more →
  • Phase correlation

    Phase correlation

    Phase correlation is an approach to estimate the relative translative offset between two similar images (digital image correlation) or other data sets. It is commonly used in image registration and relies on a frequency-domain representation of the data, usually calculated by fast Fourier transforms. The term is applied particularly to a subset of cross-correlation techniques that isolate the phase information from the Fourier-space representation of the cross-correlogram. == Example == The following image demonstrates the usage of phase correlation to determine relative translative movement between two images corrupted by independent Gaussian noise. The image was translated by (20,23) pixels. Accordingly, one can clearly see a peak in the phase-correlation representation at approximately (20,23). == Method == Given two input images g a {\displaystyle \ g_{a}} and g b {\displaystyle \ g_{b}} : Apply a window function (e.g., a Hamming window) on both images to reduce edge effects (this may be optional depending on the image characteristics). Then, calculate the discrete 2D Fourier transform of both images. G a = F { g a } , G b = F { g b } {\displaystyle \ \mathbf {G} _{a}={\mathcal {F}}\{g_{a}\},\;\mathbf {G} _{b}={\mathcal {F}}\{g_{b}\}} Calculate the cross-power spectrum by taking the complex conjugate of the second result, multiplying the Fourier transforms together elementwise, and normalizing this product elementwise. R = G a ∘ G b ∗ | G a ∘ G b ∗ | {\displaystyle \ R={\frac {\mathbf {G} _{a}\circ \mathbf {G} _{b}^{}}{|\mathbf {G} _{a}\circ \mathbf {G} _{b}^{}|}}} Where ∘ {\displaystyle \circ } is the Hadamard product (entry-wise product) and the absolute values are taken entry-wise as well. Written out entry-wise for element index ( j , k ) {\displaystyle (j,k)} : R j k = G a , j k ⋅ G b , j k ∗ | G a , j k ⋅ G b , j k ∗ | {\displaystyle \ R_{jk}={\frac {G_{a,jk}\cdot G_{b,jk}^{}}{|G_{a,jk}\cdot G_{b,jk}^{}|}}} Obtain the normalized cross-correlation by applying the inverse Fourier transform. r = F − 1 { R } {\displaystyle \ r={\mathcal {F}}^{-1}\{R\}} Determine the location of the peak in r {\displaystyle \ r} . ( Δ x , Δ y ) = arg ⁡ max ( x , y ) { r } {\displaystyle \ (\Delta x,\Delta y)=\arg \max _{(x,y)}\{r\}} === Subpixel registration === Commonly, interpolation methods are used to estimate the peak location in the cross-correlogram to non-integer values, despite the fact that the data are discrete, and this procedure is often termed 'subpixel registration'. A large variety of subpixel interpolation methods are given in the technical literature. Common peak interpolation methods such as parabolic interpolation have been used, and the OpenCV computer vision package uses a centroid-based method, though these generally have inferior accuracy compared to more sophisticated methods. Because the Fourier representation of the data has already been computed, it is especially convenient to use the Fourier shift theorem with real-valued (sub-integer) shifts for this purpose, which essentially interpolates using the sinusoidal basis functions of the Fourier transform. An especially popular FT-based estimator is given by Foroosh et al. In this method, the subpixel peak location is approximated by a simple formula involving peak pixel value and the values of its nearest neighbors, where r ( 0 , 0 ) {\displaystyle r_{(0,0)}} is the peak value and r ( 1 , 0 ) {\displaystyle r_{(1,0)}} is the nearest neighbor in the x direction (assuming, as in most approaches, that the integer shift has already been found and the comparand images differ only by a subpixel shift). Δ x = r ( 1 , 0 ) r ( 1 , 0 ) ± r ( 0 , 0 ) {\displaystyle \ \Delta x={\frac {r_{(1,0)}}{r_{(1,0)}\pm r_{(0,0)}}}} The Foroosh et al. method is quite fast compared to most methods, though it is not always the most accurate. Some methods shift the peak in Fourier space and apply non-linear optimization to maximize the correlogram peak, but these tend to be very slow since they must apply an inverse Fourier transform or its equivalent in the objective function. It is also possible to infer the peak location from phase characteristics in Fourier space without the inverse transformation, as noted by Stone. These methods usually use a linear least squares (LLS) fit of the phase angles to a planar model. The long latency of the phase angle computation in these methods is a disadvantage, but the speed can sometimes be comparable to the Foroosh et al. method depending on the image size. They often compare favorably in speed to the multiple iterations of extremely slow objective functions in iterative non-linear methods. Since all subpixel shift computation methods are fundamentally interpolative, the performance of a particular method depends on how well the underlying data conform to the assumptions in the interpolator. This fact also may limit the usefulness of high numerical accuracy in an algorithm, since the uncertainty due to interpolation method choice may be larger than any numerical or approximation error in the particular method. Subpixel methods are also particularly sensitive to noise in the images, and the utility of a particular algorithm is distinguished not only by its speed and accuracy but its resilience to the particular types of noise in the application. == Rationale == The method is based on the Fourier shift theorem. Let the two images g a {\displaystyle \ g_{a}} and g b {\displaystyle \ g_{b}} be circularly-shifted versions of each other: g b ( x , y ) = d e f g a ( ( x − Δ x ) mod M , ( y − Δ y ) mod N ) {\displaystyle \ g_{b}(x,y)\ {\stackrel {\mathrm {def} }{=}}\ g_{a}((x-\Delta x){\bmod {M}},(y-\Delta y){\bmod {N}})} (where the images are M × N {\displaystyle \ M\times N} in size). Then, the discrete Fourier transforms of the images will be shifted relatively in phase: G b ( u , v ) = G a ( u , v ) e − 2 π i ( u Δ x M + v Δ y N ) {\displaystyle \mathbf {G} _{b}(u,v)=\mathbf {G} _{a}(u,v)e^{-2\pi i({\frac {u\Delta x}{M}}+{\frac {v\Delta y}{N}})}} One can then calculate the normalized cross-power spectrum to factor out the phase difference: R ( u , v ) = G a G b ∗ | G a G b ∗ | = G a G a ∗ e 2 π i ( u Δ x M + v Δ y N ) | G a G a ∗ e 2 π i ( u Δ x M + v Δ y N ) | = G a G a ∗ e 2 π i ( u Δ x M + v Δ y N ) | G a G a ∗ | = e 2 π i ( u Δ x M + v Δ y N ) {\displaystyle {\begin{aligned}R(u,v)&={\frac {\mathbf {G} _{a}\mathbf {G} _{b}^{}}{|\mathbf {G} _{a}\mathbf {G} _{b}^{}|}}\\&={\frac {\mathbf {G} _{a}\mathbf {G} _{a}^{}e^{2\pi i({\frac {u\Delta x}{M}}+{\frac {v\Delta y}{N}})}}{|\mathbf {G} _{a}\mathbf {G} _{a}^{}e^{2\pi i({\frac {u\Delta x}{M}}+{\frac {v\Delta y}{N}})}|}}\\&={\frac {\mathbf {G} _{a}\mathbf {G} _{a}^{}e^{2\pi i({\frac {u\Delta x}{M}}+{\frac {v\Delta y}{N}})}}{|\mathbf {G} _{a}\mathbf {G} _{a}^{}|}}\\&=e^{2\pi i({\frac {u\Delta x}{M}}+{\frac {v\Delta y}{N}})}\end{aligned}}} since the magnitude of an imaginary exponential always is one, and the phase of G a G a ∗ {\displaystyle \ \mathbf {G} _{a}\mathbf {G} _{a}^{}} always is zero. The inverse Fourier transform of a complex exponential is a Dirac delta function, i.e. a single peak: r ( x , y ) = δ ( x + Δ x , y + Δ y ) {\displaystyle \ r(x,y)=\delta (x+\Delta x,y+\Delta y)} This result could have been obtained by calculating the cross correlation directly. The advantage of this method is that the discrete Fourier transform and its inverse can be performed using the fast Fourier transform, which is much faster than correlation for large images. === Benefits === Unlike many spatial-domain algorithms, the phase correlation method is resilient to noise, occlusions, and other defects typical of medical or satellite images. The method can be extended to determine rotation and scaling differences between two images by first converting the images to log-polar coordinates. Due to properties of the Fourier transform, the rotation and scaling parameters can be determined in a manner invariant to translation. === Limitations === In practice, it is more likely that g b {\displaystyle \ g_{b}} will be a simple linear shift of g a {\displaystyle \ g_{a}} , rather than a circular shift as required by the explanation above. In such cases, r {\displaystyle \ r} will not be a simple delta function, which will reduce the performance of the method. In such cases, a window function (such as a Gaussian or Tukey window) should be employed during the Fourier transform to reduce edge effects, or the images should be zero padded so that the edge effects can be ignored. If the images consist of a flat background, with all detail situated away from the edges, then a linear shift will be equivalent to a circular shift, and the above derivation will hold exactly. The peak can be sharpened by using edge or vector correlation. For periodic images (such as a chessboard or picket fence), phase correlation may yield ambiguous results with several peaks in the resulting output. == Applications == Phase correlation is the preferred m

    Read more →
  • Recursive transition network

    Recursive transition network

    A recursive transition network ("RTN") is a graph theoretical schematic used to represent the rules of a context-free grammar. RTNs have application to programming languages, natural language and lexical analysis. Any sentence that is constructed according to the rules of an RTN is said to be "well-formed". The structural elements of a well-formed sentence may also be well-formed sentences by themselves, or they may be simpler structures. This is why RTNs are described as recursive. == Notes and references ==

    Read more →
  • Serverless computing

    Serverless computing

    Serverless computing is "a cloud service category where the customer can use different cloud capability types without the customer having to provision, deploy and manage either hardware or software resources, other than providing customer application code or providing customer data. Serverless computing represents a form of virtualized computing", according to ISO/IEC 22123-2. Serverless computing is a broad ecosystem that includes the cloud provider, function as a service (FaaS), managed services, tools, frameworks, engineers, stakeholders, and other interconnected elements. == Overview == Serverless is a misnomer in the sense that servers are still used by cloud service providers to execute code for developers. The definition of serverless computing has evolved over time, leading to varied interpretations. According to Ben Kehoe, serverless represents a spectrum rather than a rigid definition. Emphasis should shift from strict definitions and specific technologies to adopting a serverless mindset, focusing on leveraging serverless solutions to address business challenges. Serverless computing does not eliminate complexity but shifts much of it from the operations team to the development team. However, this shift is not absolute, as operations teams continue to manage aspects such as identity and access management (IAM), networking, security policies, and cost optimization. Additionally, while breaking down applications into finer-grained components can increase management complexity, the relationship between granularity and management difficulty is not strictly linear. There is often an optimal level of modularization where the benefits outweigh the added management overhead. According to Yan Cui, serverless techniques should be adopted only when they help to deliver customer value faster. And while adopting, organizations should take small steps and de-risk along the way. == Challenges == Serverless applications are prone to fallacies of distributed computing. In addition, they are prone to the following fallacies: Versioning is simple Compensating transactions always work Observability is optional === Monitoring and debugging === Monitoring and debugging serverless applications can present unique challenges due to their distributed, event-driven nature and proprietary environments. Traditional tools may fall short, making it difficult to track execution flows across services. However, modern solutions such as distributed tracing tools (e.g., AWS X-Ray, Datadog), centralized logging, and cloud-agnostic observability platforms are mitigating these challenges. Emerging technologies like OpenTelemetry, AI-powered anomaly detection, and serverless-specific frameworks are further improving visibility and root cause analysis. While challenges persist, advancements in monitoring and debugging tools are steadily addressing these limitations. === Security === According to OWASP, serverless applications are vulnerable to variations of traditional attacks, insecure code, and some serverless-specific attacks (like denial of wallet). So, the risks have changed and attack prevention requires a shift in mindset. === Vendor lock-in === Serverless computing is provided as a third-party service. Applications and software that run in the serverless environment are by default locked to a specific cloud vendor. This issue is exacerbated in serverless computing, as with its increased level of abstraction, public vendors only allow customers to upload code to a FaaS platform without the authority to configure underlying environments. More importantly, when considering a more complex workflow that includes backend-as-a-service (BaaS), a BaaS offering can typically only natively trigger a FaaS offering from the same provider. This makes the workload migration in serverless computing virtually impossible. Therefore, considering how to design and deploy serverless workflows from a multi-cloud perspective could mitigate this. == High-performance computing == Serverless computing may not be ideal for certain high-performance computing (HPC) workloads due to resource limits often imposed by cloud providers, including maximum memory, CPU, and runtime restrictions. For workloads requiring sustained or predictable resource usage, bulk-provisioned servers can sometimes be more cost-effective than the pay-per-use model typical of serverless platforms. However, serverless computing is increasingly capable of supporting specific HPC workloads, particularly those that are highly parallelizable and event-driven, by leveraging its scalability and elasticity. The suitability of serverless computing for HPC continues to evolve with advancements in cloud technologies. == Anti-patterns == The grain of sand anti-pattern refers to the creation of excessively small components (e.g., functions) within a system, often resulting in increased complexity, operational overhead, and performance inefficiencies. Lambda pinball is a related anti-pattern that can occur in serverless architectures when functions (e.g., AWS Lambda, Azure functions) excessively invoke each other in fragmented chains, leading to latency, debugging and testing challenges, and reduced observability. These anti-patterns are associated with the formation of a distributed monolith. These anti-patterns are often addressed through the application of clear domain boundaries, which distinguish between public and published interfaces. Public interfaces are technically accessible interfaces, such as methods, classes, API endpoints, or triggers, but they do not come with formal stability guarantees. In contrast, published interfaces involve an explicit stability contract, including formal versioning, thorough documentation, a defined deprecation policy, and often support for backward compatibility. Published interfaces may also require maintaining multiple versions simultaneously and adhering to formal deprecation processes when breaking changes are introduced. Fragmented chains of function calls are often observed in systems where serverless components (functions) interact with other resources in complex patterns, sometimes described as spaghetti architecture or a distributed monolith. In contrast, systems exhibiting clearer boundaries typically organize serverless components into cohesive groups, where internal public interfaces manage inter-component communication, and published interfaces define communication across group boundaries. This distinction highlights differences in stability guarantees and maintenance commitments, contributing to reduced dependency complexity. Additionally, patterns associated with excessive serverless function chaining are sometimes addressed through architectural strategies that emphasize native service integrations instead of individual functions, a concept referred to as the functionless mindset. However, this approach is noted to involve a steeper learning curve, and integration limitations may vary even within the same cloud vendor ecosystem. Reporting on serverless databases presents challenges, as retrieving data for a reporting service can either break the bounded contexts, reduce the timeliness of the data, or do both. This applies regardless of whether data is pulled directly from databases, retrieved via HTTP, or collected in batches. Mark Richards refers to this as the reach-in reporting anti-pattern. A possible alternative to this approach is for databases to asynchronously push the necessary data to the reporting service instead of the reporting service pulling it. While this method requires a separate contract between services and the reporting service and can be complex to implement, it helps preserve bounded contexts while maintaining a high level of data timeliness. == Principles == Adopting DevSecOps practices can help improve the use and security of serverless technologies. In serverless applications, the distinction between infrastructure and business logic is often blurred, with applications typically distributed across multiple services. To maximize the effectiveness of testing, integration testing is emphasized for serverless applications. Additionally, to facilitate debugging and implementation, orchestration is used within the bounded context, while choreography is employed between different bounded contexts. Ephemeral resources are typically kept together to maintain high cohesion. However, shared resources with long spin-up times, such as AWS RDS clusters and landing zones, are often managed in separate repositories, deployment pipeline, and stacks.

    Read more →
  • Ogle app

    Ogle app

    Ogle is a free smartphone based social media application. It is available for iOS and Android. Ogle acts like a school wide forum that lets users and users' classmates share and interact. Users can share photos, videos, questions, even thoughts and watch submissions grow in popularity as other users vote and comment on them. == App Features == Campus Feed: Interact by watching and posting videos or pictures to your campus story. Photos and Videos: share what you want with many different timing options. Interact: Chat with friends and groups, or share a moment for all to see. Real-name system: choose to register an account with username and profile picture. Custom Stickers: Create stickers to add creativity and zest to your pictures. Flash Interaction: All private chat and group chat history will be deleted after 24 hours on Ogle Chat. == Controversies == Users can post anything on Ogle using text, photos, and videos. As a result, some Ogle user's sense of anonymity, posts have targeted specific schools and students with abusive and hurtful content. The Ogle app's user anonymity makes it difficult for school officials to quickly investigate issues that occur within the Ogle app. On March 28, 2016, three people were arrested after violent threats were made against an Anaheim high school. 18-year-old Miguel Meza was arrested Sunday afternoon during a traffic stop, along with his passenger, 23-year-old Johnny Aguilar. Police said both men had loaded handguns. Aguilar was also accused of violating his probation. "It is concerning the fact that they did have firearms, but we don't have a crystal ball. We can't determine if they possessed those firearms to engage in some kind of school violence or if they had it for another reason," Sgt. Daron Wyatt with the Anaheim Police Department said. Officials said Meza and Aguilar have known gang ties and detectives began investigating Meza after threats were made against the school on Ogle. On February 29, 2016, Santa Cruz County sheriff's deputies arrested a 16-year-old Aptos High School student Friday, accused of making an online threat of gun violence at Aptos High and Monte Vista Christian."He basically told detectives that it was all a joke. It's not a joke. You have multiple resources being spent to investigate these cases," said Santa Cruz County Sheriff's Sgt. Roy Morales. The schools remained open throughout the week, with a huge police presence on campus. In an anonymous emailed statement to the Daily Pilot on Thursday, the "Ogle team" said: "We are aware of the concern, and cyberbullying is absolutely NOT our intention for the app. Our goal for this app is to create a free and safe community space for students, for a better communication. We are currently working around the clock to improve the app. As a matter of fact, we are also in contact with local police departments, anti-bullying organizations and local high schools to try to help the students." In response to these incidents, Ogle expressed that they takes the safety of its users seriously and does not condone any type of behavior that is illegal or in violation of its content policies. The company also said it has instituted a content moderation team to increase review and identify and remove inappropriate content, and take action against “those who violate our community guidelines.”

    Read more →
  • Absher (application)

    Absher (application)

    Absher (Arabic: أبشر ‘Absher, roughly meaning "good tidings" or "yes, done") is a smartphone application and web portal which allows citizens and residents of Saudi Arabia to use a variety of governmental services. Amongst several other services with the Absher app, it can be used to apply for jobs and Hajj permits, passport info can be updated, and electronic crimes can be reported. The application provides around 280 services for residents of Saudi Arabia including but not limited to making appointments, renewing passports, residents' cards, IDs, driver's licenses and others, and, controversially, enables Saudi men to track the whereabouts of women they control as part of the country's male guardianship system. The app can be downloaded from the Google Play Store and Apple App Store and is supplied by the Saudi Interior Ministry. According to the Ministry of the Interior, Absher has more than 20 million users. As of February 2019, Absher has been downloaded 4.2 million times from the App Store. Some services provided through Absher can also be accessed through the website absher.sa. In March 2021, Saudi Arabia launched the digital version of the Absher for individuals app through which the users can download a copy of their digital ID. Then, new services were added to the platform such as online birth and death registration services, requesting amendments to academic credentials, correcting names in English and marital status and requesting civil records of children. == Impact on women's rights == The app has been criticized by various human rights activists, human rights organisations and international communities. The US and European countries have also condemned the app and urged the kingdom to end its male guardianship system. Absher gained media attention in 2019 for its functions supporting the Saudi policy of male guardianship following an investigation by Business Insider. The app allows for designated guardians to receive notifications if a woman under their guardianship passes through an airport and subsequently gives them the option to withdraw her right to travel. In a few cases, this system has been circumvented by women who have been able to gain control over its settings and use it to allow themselves to travel. US Senator Ron Wyden of Oregon wrote a letter to the CEO's of Apple and Google, criticizing the app and demanding for its removal immediately. Wyden said "American companies should not enable or facilitate the Saudi government's patriarchy," and called the Saudi system of control over women "abhorrent". According to the EU lawmakers, current rules imposed over the women by the Saudi government make women “second-class citizens”. The lawmakers also asked the EU states to continue to build pressure on Riyadh so as to improve the conditions of women and human rights. Amnesty International and Human Rights Watch accused Apple and Google of helping "enforce gender apartheid" by hosting the app. US congresswomen Rep. Katherine Clark and Rep. Carolyn B. Maloney condemned the kingdom's male guardianship system that reflected from the app, calling Absher a "patriarchal weapon" and asking for its removal. In response to the criticism received by Absher, Apple chief executive officer Tim Cook stated in February 2019 that he intended to investigate the situation. Similarly, Google announced that it would also review the application. After a prompt review, Google declined to remove the app from Google Play, citing that it did not violate the agreed upon terms and conditions of the store. Saudi doctor Khawla Al-Kuraya supported this app an editorial in Bloomberg News. Kuraya wrote that Absher helped Saudi women avoid governmental bureaucracy as it allows their male guardians to process their travel permits anywhere and anytime through Absher. Although she believes that the guardianship system needs to be reconsidered, she thinks that Absher is an important step towards facilitating women-guardians related issues in Saudi Arabia. Absher manager Atiyah Al-Anazy announced in 2019 that two million women were using the application in Saudi Arabia to facilitate their transactions. Some female users stated that the application has made their movement and travel-related issues easier. New measures were introduced that year to allow Saudi women above the age of 18 to travel without their male guardians, which ultimately released male authoritative rights on women. A law was subsequently passed allowing women over the age of 21 to receive a passport and travel without prior male permission.

    Read more →
  • Machine vision

    Machine vision

    Machine vision is the technology and methods used to provide imaging-based automatic inspection and analysis for such applications as automatic inspection, process control, and robot guidance, usually in industry. Machine vision refers to many technologies, software and hardware products, integrated systems, actions, methods and expertise. Machine vision as a systems engineering discipline can be considered distinct from computer vision, a form of computer science. It attempts to integrate existing technologies in new ways and apply them to solve real world problems. The term is the prevalent one for these functions in industrial automation environments but is also used for these functions in other environment vehicle guidance. The overall machine vision process includes planning the details of the requirements and project, and then creating a solution. During run-time, the process starts with imaging, followed by automated analysis of the image and extraction of the required information. == Definition == Definitions of the term "Machine vision" vary, but all include the technology and methods used to extract information from an image on an automated basis, as opposed to image processing, where the output is another image. The information extracted can be a simple good-part/bad-part signal, or more a complex set of data such as the identity, position and orientation of each object in an image. The information can be used for such applications as automatic inspection and robot and process guidance in industry, for security monitoring and vehicle guidance. This field encompasses a large number of technologies, software and hardware products, integrated systems, actions, methods and expertise. Machine vision is practically the only term used for these functions in industrial automation applications; the term is less universal for these functions in other environments such as security and vehicle guidance. Machine vision as a systems engineering discipline can be considered distinct from computer vision, a form of basic computer science; machine vision attempts to integrate existing technologies in new ways and apply them to solve real world problems in a way that meets the requirements of industrial automation and similar application areas. The term is also used in a broader sense by trade shows and trade groups such as the Automated Imaging Association and the European Machine Vision Association. This broader definition also encompasses products and applications most often associated with image processing. The primary uses for machine vision are automatic inspection and industrial robot/process guidance. In more recent times the terms computer vision and machine vision have converged to a greater degree. See glossary of machine vision. == Imaging based automatic inspection and sorting == The primary uses for machine vision are imaging-based automatic inspection and sorting and robot guidance.; in this section the former is abbreviated as "automatic inspection". The overall process includes planning the details of the requirements and project, and then creating a solution. This section describes the technical process that occurs during the operation of the solution. === Methods and sequence of operation === The first step in the automatic inspection sequence of operation is acquisition of an image, typically using cameras, lenses, and lighting that has been designed to provide the differentiation required by subsequent processing. MV software packages and programs developed in them then employ various digital image processing techniques to extract the required information, and often make decisions (such as pass/fail) based on the extracted information. === Equipment === The components of an automatic inspection system usually include lighting, a camera or other imager, a processor, software, and output devices. === Imaging === The imaging device (e.g. camera) can either be separate from the main image processing unit or combined with it in which case the combination is generally called a smart camera or smart sensor. Inclusion of the full processing function into the same enclosure as the camera is often referred to as embedded processing. When separated, the connection may be made to specialized intermediate hardware, a custom processing appliance, or a frame grabber within a computer using either an analog or standardized digital interface (Camera Link, CoaXPress). MV implementations also use digital cameras capable of direct connections (without a framegrabber) to a computer via FireWire, USB or Gigabit Ethernet interfaces. While conventional (2D visible light) imaging is most commonly used in MV, alternatives include multispectral imaging, hyperspectral imaging, imaging various infrared bands, line scan imaging, 3D imaging of surfaces and X-ray imaging. Key differentiations within MV 2D visible light imaging are monochromatic vs. color, frame rate, resolution, and whether or not the imaging process is simultaneous over the entire image, making it suitable for moving processes. Though the vast majority of machine vision applications are solved using two-dimensional imaging, machine vision applications utilizing 3D imaging are a growing niche within the industry. The most commonly used method for 3D imaging is scanning based triangulation which utilizes motion of the product or image during the imaging process. A laser is projected onto the surfaces of an object. In machine vision this is accomplished with a scanning motion, either by moving the workpiece, or by moving the camera & laser imaging system. The line is viewed by a camera from a different angle; the deviation of the line represents shape variations. Lines from multiple scans are assembled into a depth map or point cloud. Stereoscopic vision is used in special cases involving unique features present in both views of a pair of cameras. Other 3D methods used for machine vision are time of flight and grid based. One method is grid array based systems using pseudorandom structured light system as employed by the Microsoft Kinect system circa 2012. === Image processing === After an image is acquired, it is processed. Central processing functions are generally done by a CPU, a GPU, a FPGA or a combination of these. Deep learning training and inference impose higher processing performance requirements. Multiple stages of processing are generally used in a sequence that ends up as a desired result. A typical sequence might start with tools such as filters which modify the image, followed by extraction of objects, then extraction (e.g. measurements, reading of codes) of data from those objects, followed by communicating that data, or comparing it against target values to create and communicate "pass/fail" results. Machine vision image processing methods include; Stitching/Registration: Combining of adjacent 2D or 3D images. Filtering (e.g. morphological filtering) Thresholding: Thresholding starts with setting or determining a gray value that will be useful for the following steps. The value is then used to separate portions of the image, and sometimes to transform each portion of the image to simply black and white based on whether it is below or above that grayscale value. Pixel counting: counts the number of light or dark pixels Segmentation: Partitioning a digital image into multiple segments to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze. Edge detection: finding object edges Color Analysis: Identify parts, products and items using color, assess quality from color, and isolate features using color. Blob detection and extraction: inspecting an image for discrete blobs of connected pixels (e.g. a black hole in a grey object) as image landmarks. Neural network / deep learning / machine learning processing: weighted and self-training multi-variable decision making Circa 2019 there is a large expansion of this, using deep learning and machine learning to significantly expand machine vision capabilities. The most common result of such processing is classification. Examples of classification are object identification,"pass fail" classification of identified objects and OCR. Pattern recognition including template matching. Finding, matching, and/or counting specific patterns. This may include location of an object that may be rotated, partially hidden by another object, or varying in size. Barcode, Data Matrix and "2D barcode" reading Optical character recognition: automated reading of text such as serial numbers Gauging/Metrology: measurement of object dimensions (e.g. in pixels, inches or millimeters) Comparison against target values to determine a "pass or fail" or "go/no go" result. For example, with code or bar code verification, the read value is compared to the stored target value. For gauging, a measurement is compared against the proper value and tolerances. For verification of alpha-numberic codes, the

    Read more →
  • Esdat

    Esdat

    ESdat is a data management, analysis and reporting software for environmental and groundwater data, developed by EarthScience Information Systems (EScIS). It is used to manage many types of environmental data including laboratory chemistry (analytical results, QA data, lab sample planning, and electronic Chain of Custody), field chemistry (water, gas, and soil), hydrogeological data (groundwater, borehole and well construction, lithological, geotechnical and stratigraphic, and LNAPL), meteorological data (rain, wind, and temperature), emission data (dust deposition, HiVol, air quality, and noise) and logger data. Data can be compared against environmental standards or site-specific trigger levels to generate exceedence tables, time series graphs, maps, statistics, and other outputs. ESdat integrates with Power BI and ArcGIS and data can also be exported in a range of other database formats, including USEPA Regions 2,4 & 5, and NYS DEC. ESdat is used by environmental consultants, government, mining and industry for validation, interrogation, and reporting of data derived from complex environmental programs, such as contaminated sites, groundwater investigations, and regulatory compliance for landfills or mining operations.

    Read more →
  • Ernie Bot

    Ernie Bot

    Ernie Bot (Chinese: 文心一言, Pinyin: wénxīn yīyán), full name Enhanced Representation through Knowledge Integration, is an artificial intelligence chatbot developed by the Chinese technology company Baidu. Ernie Bot rivals GPT models in Chinese NLP tasks. It is built on the company's ERNIE series of large language models, which have been in development since 2019. The service was first launched for invited testing on March 16, 2023, and was released to the general public on August 31, 2023, after receiving approval from Chinese regulators. Since its public launch, Ernie Bot has undergone several updates, with newer versions like ERNIE 4.0 and 4.5 released to improve its capabilities. The service has seen rapid user adoption, reportedly reaching over 200 million users by April 2024. It has been integrated into various products, notably powering AI features for the Chinese release of Samsung's Galaxy S24 smartphones. As a product operating in China, Ernie Bot is subject to the country's censorship regulations. It has been observed to refuse answers to politically sensitive questions, such as those regarding CCP general secretary Xi Jinping, the 1989 Tiananmen Square protests and massacre, and other topics deemed taboo by the government. == History == Ernie Bot was initially released for invited testing on March 16, 2023. The live release demo was reported to have been prerecorded, which caused Baidu's stock to drop 10 percent on the day of the launch. The company's stock gained 14 percent the following day after analysts from Citigroup and Bank of America tested Ernie Bot and gave it positive preliminary reviews. On August 31, 2023, Ernie Bot was released to the public after receiving approval from Chinese regulatory authorities. By December 2023, Baidu announced the service had surpassed 100 million users. In January 2024, Hong Kong newspaper South China Morning Post reported that a university research lab linked to the People's Liberation Army (PLA) had tested Ernie Bot for military response scenarios. Baidu denied the allegations, stating it had no connection with the academic paper. That same month, Ernie was integrated into Samsung's Galaxy S24 lineup for its launch in China. The user base reportedly grew to 200 million by April 2024 and 300 million by June 2024. In September 2024, Baidu changed the chatbot's Chinese name from "Wenxin Yiyan" (文心一言) to "Wenxiaoyan" (文小言) to position it as a search assistant. On March 16, 2025, Baidu announced version 4.5 and the reasoning model ERNIE X1. The following month, at the Create2025 Baidu AI Developer Conference, the company released the Wenxin 4.5 Turbo and Wenxin X1 Turbo models, designed to be faster and less expensive to operate. == Development == Ernie Bot is based on Baidu's ERNIE (Enhanced Representation through Knowledge Integration) series of foundation models. The general training process begins with pre-training on large datasets, followed by refinement using techniques like supervised fine-tuning, reinforcement learning with human feedback, and prompt engineering. === Foundation models === ==== Ernie 3.0 ==== The model powering the initial launch of Ernie Bot. It was trained with 10 billion parameters on a 4-terabyte corpus consisting of plain text and a large-scale knowledge graph. ==== Ernie 3.5 ==== Released in June 2023. At the time of release, its performance was reported as "slightly inferior" to OpenAI's GPT-4. ==== Ernie 4.0 ==== Unveiled in October 2023 and released to paying subscribers in November. According to Baidu, this version featured improved performance over its predecessor, with information updated to April 2023. ==== Ernie X1 ==== Announced in March 2025, with Ernie X1 positioned as a specialized reasoning model. Baidu stated that performance improvements were achieved through new technologies such as "FlashMask" dynamic attention masking and a heterogeneous multimodal mixture-of-experts architecture. === Turbo Models === In June 2024, Baidu announced Ernie 4.0 Turbo. In April 2025, Ernie 4.5 Turbo and X1 Turbo were released. These models are optimized for faster response times and lower operational costs. == Service == In its subscription options, the professional plan gives users access to Ernie 4.0 with a payment either for a month or with reduced payment for auto-renewal per month. Meanwhile, Ernie 3.5 is free of charge. Ernie 4.0, the language model for Ernie bot, has information updated to April 2023. == Censorship == Ernie Bot is subject to the Chinese government's censorship regime. In public tests with journalists, Ernie Bot refused to answer questions about CCP general secretary Xi Jinping, the 1989 Tiananmen Square protests and massacre, the persecution of Uyghurs in China in Xinjiang, and the 2019–2020 Hong Kong protests. When queried about the origin of SARS-CoV-2, Ernie Bot stated that it originated among American vape users.

    Read more →
  • 1 Second Everyday

    1 Second Everyday

    1 Second Everyday (1SE) is an application developed by Cesar Kuriyama. The application allows the user to record one second of video every day and then chronologically edits (mashes) them together into a single film. It is compatible with iOS and Android. The idea of the application was developed by Kuriyama's 1 Second Everyday — Age 30 video. The application was launched in January 2013. 1 Second Everyday played a part in the plot of Chef and also became the inspiration for the 2014 short animated clip Feast. == Background == === Kuriyama's video === In February 2011, when Cesar Kuriyama turned 30, after saving money, he quit his job in an advertising firm and took a year off to travel. During this time, he started working on a project he called 1 Second Everyday. As part of the project, every day he recorded one second of video – something that was supposed to help him remember that day. He started the project because he was frustrated with his memory. He planned to stockpile the 365 one-second clips into one film to serve as a memento of his year. While working on the project Kuriyama realized that recording one second every day impacted the decisions he made in a positive way. After a year he made a 365-second clip out of his recordings. The video called 1 Second Everyday – Age 30, went viral. According to Kuriyama, he was initially inspired to take a year off from work by a TED talk given by Stefan Sagmeister called "The Power of Time Off." Kuriyama also delivered a TED talk about 1 Second Everyday in 2012 at TED 2012 in Long Beach California. === Kickstarter campaign === After completing his own video, Kuriyama decided to develop an application that would allow the users to record one second every day and compile their own videos. He developed a prototype of the application and then in 2012, he launched a Kickstarter campaign to raise funds for completing the application. The campaign became one of the most backed app campaigns in the history of Kickstarter. It was backed by 11,281 backers who pledged a total of $56,959 on an initial goal of $20,000. Following the completion of the Kickstarter campaign, he partnered with an application design studio in Brooklyn to develop the application. 1 Second Everyday was released two weeks after the completion of its Kickstarter campaign. == Application == The application was released for iOS on 10 January 2013. An Android-compatible version of the application was developed later. Using it, the user can record the videos in the application or they can select one second portions from their libraries. 1 Second Everyday dates every snippet. The user can also set alarms to remember to record their daily video. In order to compile a video, the user selects the seconds they want and the application creates a compilation video. The user can keep multiple timelines. It also allows users to post directly on social networks. The main interface in 1 Second Everyday is a calendar, which shows the user which days have snippets and which they can still fill in. In the beginning, 1 Second Everyday restricted the recording to one second. However, the developers later released Super Seconds, which allowed users to record an additional half a second video. In 2014, 1 Second Everyday Crowds was launched, which is an area in the application featuring compilations of second clips from different users. == In the media == The Kickstarter campaign of 1 Second Everyday was featured in Entrepreneur's 3 Innovative Tech Startups on Kickstarter Right Now in 2012. The application was featured in The New York Times, The Washington Post, Gawker and other media outlets. By the end of the launch day, it was in Top 10 Free Apps on App Store. It was also selected as the App of the Week on GeekWire in 2013. Several other one-second compilation videos were also posted on the Internet after Kuriyama's video gained media attention. Sam Cornwell, an English photographer documented his son Indigo's growth using a montage of one-second iPhone clips. He shot these clips every single day from the moment of birth right up to the baby's first birthday. According to Cornwell, he was inspired by Kuriyama's project. The video of Cornwell's son gained considerable media attention after it was posted on YouTube. Save the Children also made a video commercial based on a similar format that showed a British girl oblivious of the Syrian war end up being a refugee. 1SE was a finalist for the Fast Company Innovation by Design Award in 2015, but lost to Google Maps. In 2015, Google Android created a gallery, Leap Second 2015, with the help of Droga5 and Kuriyama. The gallery showcased how people around the world enjoyed the one extra second of their lives. Through the 1 Second Everyday app available at Google Play, people were able to submit their extra second, which were then vetted and added to the gallery. The viewers were able to view other celebratory seconds from around the world as well as searching for them using different hashtags.

    Read more →