AI alignment

AI alignment

In the field of artificial intelligence (AI), alignment aims to steer AI systems toward a person's or group's intended goals, preferences, or ethical principles. An AI system is considered aligned if it advances the intended objectives. A misaligned AI system pursues unintended objectives. It is often difficult for AI designers to specify the full range of desired and undesired behaviors. Therefore, the designers often use simpler proxy goals, such as gaining human approval. But proxy goals can overlook necessary constraints or reward the AI system for merely appearing aligned. AI systems may also find loopholes that allow them to accomplish their proxy goals efficiently but in unintended, sometimes harmful, ways (reward hacking). Advanced AI systems may develop unwanted instrumental strategies, such as seeking power or self-preservation because such strategies help them achieve their assigned final goals. Furthermore, they might develop undesirable emergent goals that could be hard to detect before the system is deployed and encounters new situations and data distributions. Empirical research showed in 2024 that advanced large language models (LLMs) such as OpenAI o1 or Claude 3 sometimes engage in strategic deception to achieve their goals or prevent them from being changed. Some of these issues affect existing commercial systems such as LLMs, robots, autonomous vehicles, and social media recommendation engines. Some AI researchers argue that more capable future systems will be more severely affected because these problems partially result from high capabilities. Many prominent AI researchers and AI company leaders have argued or asserted that AI is approaching human-like (AGI) and superhuman cognitive capabilities (ASI), and could endanger human civilization if misaligned. These include "AI godfathers" Geoffrey Hinton and Yoshua Bengio and the CEOs of OpenAI, Anthropic, and Google DeepMind. These risks remain debated. AI alignment is a subfield of AI safety, the study of how to build safe AI systems. Other subfields of AI safety include robustness, monitoring, and capability control. Research challenges in alignment include instilling complex values in AI, developing honest AI, scalable oversight, auditing and interpreting AI models, and preventing emergent AI behaviors like power-seeking. Alignment research has connections to interpretability research, (adversarial) robustness, anomaly detection, calibrated uncertainty, formal verification, preference learning, safety-critical engineering, game theory, algorithmic fairness, and social sciences. == Objectives in AI == Programmers provide an AI system such as AlphaZero with an "objective function", in which they intend to encapsulate the goal(s) the AI is configured to accomplish. Such a system later populates a (possibly implicit) internal "model" of its environment. This model encapsulates all the agent's beliefs about the world. The AI then creates and executes whatever plan is calculated to maximize the value of its objective function. For example, when AlphaZero is trained on chess, it has a simple objective function of "+1 if AlphaZero wins, −1 if AlphaZero loses". During the game, AlphaZero attempts to execute whatever sequence of moves it judges most likely to attain the maximum value of +1. Similarly, a reinforcement learning system can have a "reward function" that allows the programmers to shape the AI's desired behavior. An evolutionary algorithm's behavior is shaped by a "fitness function". == Alignment problem == In 1960, AI pioneer Norbert Wiener described the AI alignment problem as follows: If we use, to achieve our purposes, a mechanical agency with whose operation we cannot interfere effectively [...] we had better be quite sure that the purpose put into the machine is the purpose which we really desire. AI alignment refers to ensuring that an AI system's objectives match some target. The target is variously defined as the goals of the system's designers or users, widely shared values, objective ethical standards, legal requirements, or the intentions its designers would have if they were more informed and enlightened. In democratic AI alignment, the target is the values and preferences of median voters, which increases political legitimacy. AI alignment is an open problem for modern AI systems and is a research field within AI. Aligning AI involves two main challenges: carefully specifying the purpose of the system (outer alignment) and ensuring that the system adopts the specification robustly (inner alignment). Researchers also attempt to create AI models that have robust alignment, sticking to safety constraints even when users adversarially try to bypass them. === Specification gaming and side effects === To specify an AI system's purpose, AI designers typically provide an objective function, examples, or feedback to the system. But designers are often unable to completely specify all important values and constraints, so they resort to easy-to-specify proxy goals such as maximizing the approval of human overseers, who are fallible. As a result, AI systems can find loopholes that help them accomplish the specified objective efficiently but in unintended, possibly harmful ways. This tendency is known as specification gaming or reward hacking, and is an instance of Goodhart's law. As AI systems become more capable, they are often able to game their specifications more effectively. Specification gaming has been observed in numerous AI systems. OpenAI GPT models for programming—including in real-world cases—have been found to explicitly plan hacking the tests used to evaluate them to falsely appear successful (e.g., explicitly stating "let's hack"). When the company penalized this, many models learned to obfuscate their plans while continuing to hack the tests. Another system was trained to finish a simulated boat race by rewarding the system for hitting targets along the track, but the system achieved more reward by looping and crashing into the same targets indefinitely. A 2025 Palisade Research study found that when tasked to win at chess against a stronger opponent, some reasoning LLMs attempted to hack the game system, for example by modifying or entirely deleting their opponent. Some alignment researchers aim to help humans detect specification gaming and steer AI systems toward carefully specified objectives that are safe and useful to pursue. When a misaligned AI system is deployed, it can have consequential side effects. Social media platforms have been known to optimize their recommendation algorithms for click-through rates, causing user addiction on a global scale. Stanford researchers say that such recommender systems are misaligned with their users because they "optimize simple engagement metrics rather than a harder-to-measure combination of societal and consumer well-being". Explaining such side effects, Berkeley computer scientist Stuart J. Russell said that the omission of implicit constraints can cause harm: "A system [...] will often set [...] unconstrained variables to extreme values; if one of those unconstrained variables is actually something we care about, the solution found may be highly undesirable. This is essentially the old story of the genie in the lamp, or the sorcerer's apprentice, or King Midas: you get exactly what you ask for, not what you want." Some researchers suggest that AI designers specify their desired goals by listing forbidden actions or by formalizing ethical rules (as with Asimov's Three Laws of Robotics). But Russell and Norvig argue that this approach overlooks the complexity of human values: "It is certainly very hard, and perhaps impossible, for mere humans to anticipate and rule out in advance all the disastrous ways the machine could choose to achieve a specified objective." Additionally, even if an AI system fully understands human intentions, it may still disregard them, because following human intentions may not be its objective (unless it is already fully aligned). === Pressure to deploy unsafe systems === Commercial organizations sometimes have incentives to take shortcuts on safety and to deploy misaligned or unsafe AI systems. For example, social media recommender systems have been profitable despite creating unwanted addiction and polarization. Competitive pressure can also lead to a race to the bottom on AI safety standards. For example, OpenAI has been sued for releasing a ChatGPT version that encouraged suicide for some unstable users, a behavior the company had overlooked amid a rushed product release. Similarly, in 2018, a self-driving car killed a pedestrian (Elaine Herzberg) after engineers disabled the emergency braking system because it was oversensitive and slowed development. === Risks from advanced misaligned AI === Some researchers are interested in aligning increasingly advanced AI systems, as progress in AI development is rapid, and industry and governments are trying to build advan

ISSCO Graphics

Integrated Software Systems Corporation (ISSCO), doing business as ISSCO Graphics, was an American software developer and publisher based in San Diego, California, and active from 1970 to 1986. They were best known for their enterprise graphics software packages, including Tellagraf, CueChart and Disspla. == History == ISSCO Graphics had considered acquiring Breakthrough Software, whose software focus involved PC DOS, as a means of getting into the PC arena, but backed off when Computer Associates made an offer to acquire ISSCO. By early 1987 it was reported that "Issco users breathe sigh of relief" that all was well. The ISSCO User's Group was founded in 1976. ISSCO, which was founded in 1970 by Peter Preuss, was acquired by Computer Associates in 1986. == Notable products == === Tellagraf === ISSCO's Tellagraf is an early software package designed to allow end-users to "turn out full color, professional quality charts" with initial results displayed on a screen, modified as needed, and then "a final 'hard-copy' can be made .. or made into 35mm color transparencies for projection onto a screen." Users of Tellagraf often had access to CueChart and Disspla software. Often computer sites having one had all three. Terminals with varying degrees of graphics, such as the DEC's VT100 and Tektronix's Tektronix 4xxx family of text and graphics terminals. were supported, and the software ran on popular computing platforms. Four years are important to Tellagraf's early history: 1978: ease of use 1980: graphic-artist quality 1982: introduction of CueChart, and recognition by IEEE. 1983: "quality graphics enters the mainstream of data processing with ..." Tellegraf was eventually acquired by Computer Associates and renamed CA-Tellegraf. SAS users found it helpful. Universities, research institutes and financial services firms were among early users. === Disspla === Disspla is a package of data plotting subroutines that can be used from high level languages. It was also acquired by Computer Associates. === Tellaplan === In 1983 ISSCO introduced Tellaplan, "a project planning, report and schedule charting system for Tell-A- Graf users in IBM MVS or CMS or Digital Equipment Corp. VAX computers" atop which they built "two visual project management software packages" three years later.

Fantavision

Fantavision is an animation program by Scott Anderson for the Apple II and published by Broderbund in 1985. Versions were released for the Apple IIGS (1987), Amiga (1988), and MS-DOS (1988). Fantavision allows the creation of vector graphics animations using the mouse and keyboard. The user creates frames, and the software generates the frames between them. Because this is done in real-time, it allows for creative exploration and quick changes. The program uses a graphical user interface in the style of the Macintosh with pull-down menus and black text on a white background. Advertisements claimed Fantavision a revolutionary breakthrough that brings the animation features of "tweening" and "transforming" to home computers. == Reception == Compute! in 1989 called Fantavision the best animation program for the IBM PC, although it noted the inability to draw curves. == Reviews == Games #70

Intel Threat Detection Technology

Intel Threat Detection Technology (TDT) is a CPU-level technology created by Intel in 2018 to enable host endpoint protections to use a CPU's low-level access to detect threats to a system. TDT consists of multiple components including Accelerated Memory Scanning, which uses the CPU's integrated GPU to scan memory, and Advanced Platform Telemetry, which uses processor-level activity monitoring to detect unusual activity. It is supported on sixth-generation or newer Intel Core CPUs and additional capabilities were added to the 11th generation Core processors. Intel TDT is integrated into several third-party anti-malware solutions including Microsoft Defender, Check Point Harmony Endpoint, CrowdStrike Falcon, and others. == Accelerated Memory Scanning == Accelerated Memory Scanning (also referred to as "Advanced Memory Scanning") uses the CPU's integrated GPU to scan memory for malicious code, instead of using the CPU directly. This improves system responsiveness during anti-malware scanning. and lowers power consumption. Features include pattern matching, using random forest decision trees, string extraction, entropy calculation, and Euclidean clustering. == Advanced Platform Telemetry == Advanced Platform Telemetry collects CPU-level telemetry to detect uncommon activity patterns which might be indicative of malware. The telemetry data is collected from the CPU performance monitoring unit (PMU) and doesn't require a large signature database to detect malware. Instead, it uses machine-learning based correlations to identify indicators of attack For example, Microsoft Defender is able to use TDT's Advanced Platform Telemetry features to detect processor usage patterns indicative of ransomware and cryptojacking with TDT so it can detect them.

Taimi

Taimi ( TAY-mee) is a dating app that caters to the LGBTQI+ community. The network matches its registered users based on their selected preferences and location. Originally an online dating service for gay men, by 2022 Taimi had become an app for all members of the LGBTQI+ community. It operates in more than 138 countries, including the US, UK, the Netherlands, Spain, Central and South America, Ukraine, and other European and Asian countries. Taimi runs on iOS and Android. The mobile app has a free and subscription-based premium version and offers a number of services for communication, including live streaming, chatting, and video calling. There is also an active blog that regularly posts articles and news about events of interest to the LGBTQ+ community. The application does not provide for non-Google e-mail log option, either phone number or Facebook account, during the registration process. The data controller for the non EU/UK users is based in a company, called Social Impact Inc., with its registered address at 1180 North Town Center Drive Suite 100, Las Vegas, Nevada, 89144, United States of America. == History == Taimi was launched in 2017 by Social Impact, Inc. in Las Vegas. Its founder, Alex Pasykov, originally called the app "Tame Me," a name that gradually morphed into Taimi. Over time, Taimi expanded into other countries, and expanding its reach to the LGBTQ+ community, so that, by 2022, it was fully inclusive of the entire queer community. In November 2020 the app was redesigned, with a new interface, branding, and logo. As of 2024, there are over 25 million registered users of Taimi worldwide. Pasykov states that he is an ally of the LGBTQ+ community and that he is focused on, among other things, partnering with NGOs to fight Homophobia and "regressive policies and laws" that negatively impact the community. == Features == Users register on the app and complete a profile, including personal information and preferences for compatibility, dating style, and relationship goals. An algorithm then finds and presents recommendations that a user accepts or rejects. Users are then free to chat via text or video with people they have connected with. Safety and security features include a two-step authentication process and an automated account verification along with a clear reporting system when breaches or policy violations occur. User responses to new features and policies drive changes and modifications that are made to all aspects of the site. == Partnerships and Collaborations == Taimi has a long history of collaborations and partnerships in Pride events, both in the US and abroad, including fund-raising efforts. Taimi has partnered with Rakuten Viber to create a bot focused on educating its members on key LGBTQ+ topics and to allow queer Viber users to connect. In 2023, Taimi collaborated with the Known Agency in an "America the Beautiful" campaign to shine a spotlight on current anti-LGBTQ+ policies and laws in a number of US states, and to counter these by highlighting the values and freedoms upon which America was founded. The campaign was nominated for The Drum Awards in the category "OOH For Good" and honored with the ANA Multicultural Excellence Award. Taimi also partnered with Goodparts, a queer-owned and operated retailer, in a "Body Beautiful" campaign focused on love and acceptance of all body types. In this campaign, well-known LGBTQ+ artists are providing artwork for Goodpart's product packaging. From October 31 to December 13, 2023, Taimi showed the "Taimi Moments" video, created in collaboration with Raygun Agency, on large screens between performances of LGBTQ+ artists Doja Cat, Ice Spice, and Doechii on their Scarlet Tour. In spring 2024, Taimi launched Queer Paradise, a series of live events in Southern California to celebrate diversity, sexual exploration, and dating fluidity. Each event in the series was curated to give the full spectrum of groups within the LGBTQ+ community a space to express their authentic selves. Taimi's partners for Queer Paradise include Hawtmess Productions, Eden Entertainment Group, Hump Events, Girls Gays & Theys, Damn Good Dyke Nights, and Gaybors Agency. In summer 2024, with support from GLAAD, Taimi has updated features and self-expression tools to better serve the LGBTQ+ people seeking connection in the app. Taimi allowed members to select multiple sexualities, unified the list of sexualities across all genders, added more pronoun options, and created a more inclusive and improved list of subcategories for non-binary users. Also, in summer 2024, Taimi has partnered with gender-affirming underwear brand Urbody to release a capsule collection. Focused on gender inclusivity and sexual fluidity, the capsule collection includes a range of underwear and compression tops intended to promote "joy, self-love and empowerment."

International Medical Education Directory

The International Medical Education Directory (IMED) was a public database of worldwide medical schools. The IMED was published as a joint collaboration of the Educational Commission for Foreign Medical Graduates (ECFMG) and the Foundation for Advancement of International Medical Education and Research (FAIMER). The information available in IMED was derived from data collected by the Educational Commission for Foreign Medical Graduates (ECFMG) throughout its history of evaluating the medical education credentials of international medical graduates. Using these data as a starting point, Foundation for Advancement of International Medical Education and Research (FAIMER) began developing IMED in 2001 and made it publicly available in April 2002. In April 2014, IMED was merged with the Avicenna Directory to create the World Directory of Medical Schools. The World Directory is now the definitive list of medical schools in the world, as IMED and Avicenna were discontinued in 2015.

Sanad (government app)

Sanad (Arabic: سند) is the official digital identity and e-government services application of the Hashemite Kingdom of Jordan. Developed and managed by the Ministry of Digital Economy and Entrepreneurship, the app provides a unified platform for accessing a range of public services and personal records digitally. == Overview == Launched in February 2020, Sanad is part of Jordan's broader digital transformation strategy aimed at improving public service delivery and enhancing administrative efficiency. The app allows users to authenticate their identity digitally and access over 550 services from more than 50 government and private sector entities. == Features == Sanad provides a wide array of services, including: Viewing and managing official digital documents Applying for government services (e.g., jordanian passport issuance or renewal, health insurance) Accessing personal records (e.g., pension, property ownership) Digitally signing documents Paying utility bills and traffic fines Receiving and tracking official notifications The app is available on iOS, Android, and HarmonyOS platforms and supports both Arabic and English languages. == Digital Identity == A core feature of Sanad is the digital identity system, which enables secure login and authentication for all integrated services. Users must activate their digital identity at designated Sanad stations across Jordan to access the full suite of services. == Adoption and Impact == As of 2025, more than 1.6 million Jordanians have activated their digital identities through Sanad. The app has played a significant role in streamlining government interactions and reducing the need for in-person visits, especially during the COVID-19 pandemic. == Recent Developments == In 2025, the Ministry launched an updated version of the app with enhanced user experience and new services, including the e-passport issuance feature.