Dave's Redistricting App (DRA) is an online web app originally created by Dave Bradlee that allows anyone to simulate redistricting a U.S. state's congressional and legislative districts. == Purpose == According to Bradlee, the software was designed to "put power in people's hands," and so that they "can see how the process works, so it's a little less mysterious than it was 10 years ago." Bradlee has noticed that many citizens are taking this process seriously and using his app to create legitimate redistricting maps that could be put in place. Some websites have called Bradlee the pioneer and cause of the rise of do-it-yourself redistricting. States such as Montana in 2021 allowed the general population to use it to submit redistricting proposals following the 2020 United States Census. Dave's Redistricting has frequently been mentioned as a resource that can be used to combat gerrymandering, given that the public has free access to it. Political science firms such as FiveThirtyEight have used the website to draw examples of gerrymandered districts, including on their famous Atlas of Redistricting. Dave Bradlee built the first generation of DRA. DRA 2020 is built by a small team of volunteers—Dave Bradlee, Terry Crowley, Alec Ramsay, and David Rinn—all with a shared passion for technology & democracy and all Microsoft veterans. Their mission is to empower civic organizations and citizen activists to advocate for fair congressional and legislative districts and increased transparency in the redistricting process. == Functions == Users can redraw the congressional and state legislative districts for all 50 states, the District of Columbia, and Puerto Rico using a variety of census and election datasets including Cook PVI. Maps can be optimized for different criteria. DRA 2020 added several major features to the first generation app: Sharing & collaborative editing of maps, like Google Docs Multiple statewide elections for all 50 states including the ability to import your own data Comprehensive analytics for evaluating and comparing maps Custom overlays, and Block-level editing DRA remains free to use. == Versions == 2.2: This uses Bing Maps, an outdated software that projects the districts of a single state onto a map of the United States. 2.5: After Bing Maps announced that it would no longer be updating for the foreseen future, the U.S. Map feature was removed. DRA 2020: At the end of 2018, a beta version of 2020 was released. This version that did not require Microsoft Silverlight and could be used in any web browser. DRA 2020 has been under continuous development since and is built using React (JavaScript library), Mapbox, OpenStreetMap, TypeScript, Node.js, Amazon Web Services, as well as many open source components, tools, and icons.
Radek Maneuver
The Radek Maneuver is a scale-up-then-scale-down tactic used in the administration of web services, specifically those deployed under a cloud computing paradigm (by a provider e.g. Amazon Elastic Compute Cloud or Microsoft Azure). == History == Developed by Olivier "Radek" Dabrowski in the mid-2010s, the Radek Maneuver was originally conceived of in using and maintaining applications running on a PaaS system. == Execution == The Radek Maneuver consists of a series of steps, usually executed via the PaaS or web portal interface. The tactic should be used when a service is misbehaving or otherwise experiencing errors, and the suspected cause is the underlying cloud layer, rather than the application layer. This includes networking issues and other "bad box" problems. The steps are as follows: Identify the application or service which is misbehaving. Increase the compute resource (number of CPU cores, amount of ram) for the instance on which the application is running. This is also known as scaling up. Wait for the application to re-deploy and stabilize. Scale back down to the original instance size. == Principle of action == This scale-up-scale-down method is understood to shift the application to a different physical machine underlying the PaaS service or application virtual machine. While this layer of the cloud computing stack is generally out of the access of an application developer (instead in the hands of the cloud provider), the maneuver allows troubleshooting and dodging errors in that layer.
Golden record (informatics)
In informatics, a golden record is the valid version of a data element (record) in a single source of truth system. It may refer to a database, specific table or data field, or any unit of information used. A golden copy is a consolidated data set, and is supposed to provide a single source of truth and a "well-defined version of all the data entities in an organizational ecosystem". Other names sometimes used include master source or master version. The term has been used in conjunction with data quality, master data management, and similar topics. (Different technical solutions exist, see master data management). == Master data == In master data management (MDM), the golden copy refers to the master data (master version) of the reference data which works as an authoritative source for the "truth" for all applications in a given IT landscape.
Tagsistant
Tagsistant is a semantic file system for the Linux kernel, written in C and based on FUSE. Unlike traditional file systems that use hierarchies of directories to locate objects, Tagsistant introduces the concept of tags. == Design and differences with hierarchical file systems == In computing, a file system is a type of data store which could be used to store, retrieve and update files. Each file can be uniquely located by its path. The user must know the path in advance to access a file and the path does not necessarily include any information about the content of the file. Tagsistant uses a complementary approach based on tags. The user can create a set of tags and apply those tags to files, directories and other objects (devices, pipes, ...). The user can then search all the objects that match a subset of tags, called a query. This kind of approach is well suited for managing user contents like pictures, audio recordings, movies and text documents but is incompatible with system files (like libraries, commands and configurations) where the univocity of the path is a security requirement to prevent the access to a wrong content. == The tags/ directory == A Tagsistant file system features four main directories: archive/ relations/ stats/ tags/ Tags are created as sub directories of the tags/ directory and can be used in queries complying to this syntax: tags/subquery/[+/subquery/[+/subquery/]]/@/ where a subquery is an unlimited list of tags, concatenated as directories: tag1/tag2/tag3/.../tagN/ The portion of a path delimited by tags/ and @/ is the actual query. The +/ operator joins the results of different sub-queries in one single list. The @/ operator ends the query. To be returned as a result of the following query: tags/t1/t2/+/t1/t4/@/ an object must be tagged as both t1/ and t2/ or as both t1/ and t4/. Any object tagged as t2/ or t4/, but not as t1/ will not be retrieved. The query syntax deliberately violates the POSIX file system semantics by allowing a path token to be a descendant of itself, like in tags/t1/t2/+/t1/t4/@ where t1/ appears twice. As a consequence a recursive scan of a Tagsistant file system will exit with an error or endlessly loop, as done by Unix find: This drawback is balanced by the possibility to list the tags inside a query in any order. The query tags/t1/t2/@/ is completely equivalent to tags/t2/t1/@/ and tags/t1/+/t2/t3/@/ is equivalent to tags/t2/t3/+/t1/@/. The @/ element has the precise purpose of restoring the POSIX semantics: the path tags/t1/@/directory/ refers to a traditional directory and a recursive scan of this path will properly perform. == The reasoner and the relations/ directory == Tagsistant features a simple reasoner which expands the results of a query by including objects tagged with related tags. A relation between two tags can be established inside the relations/ directory following a three level pattern: relations/tag1/rel/tag2/ The rel element can be includes or is_equivalent. To include the rock tag in the music tag, the Unix command mkdir can be used: mkdir -p relations/music/includes/rock The reasoner can recursively resolve relations, allowing the creation of complex structures: mkdir -p relations/music/includes/rock mkdir -p relations/rock/includes/hard_rock mkdir -p relations/rock/includes/grunge mkdir -p relations/rock/includes/heavy_metal mkdir -p relations/heavy_metal/includes/speed_metal The web of relations created inside the relations/ directory constitutes a basic form of ontology. == Autotagging plugins == Tagsistant features an autotagging plugin stack which gets called when a file or a symlink is written. Each plugin is called if its declared MIME type matches The list of working plugins released with Tagsistant 0.6 is limited to: text/html: tags the file with each word in
Flajolet–Martin algorithm
The Flajolet–Martin algorithm is an algorithm for approximating the number of distinct elements in a stream with a single pass and space-consumption logarithmic in the maximal number of possible distinct elements in the stream (the count-distinct problem). The algorithm was introduced by Philippe Flajolet and G. Nigel Martin in their 1984 article "Probabilistic Counting Algorithms for Data Base Applications". Later it has been refined in "LogLog counting of large cardinalities" by Marianne Durand and Philippe Flajolet, and "HyperLogLog: The analysis of a near-optimal cardinality estimation algorithm" by Philippe Flajolet et al. In their 2010 article "An optimal algorithm for the distinct elements problem", Daniel M. Kane, Jelani Nelson and David P. Woodruff give an improved algorithm, which uses nearly optimal space and has optimal O(1) update and reporting times. == The algorithm == Assume that we are given a hash function h a s h ( x ) {\displaystyle \mathrm {hash} (x)} that maps input x {\displaystyle x} to integers in the range [ 0 ; 2 L − 1 ] {\displaystyle [0;2^{L}-1]} , and where the outputs are sufficiently uniformly distributed. Note that the set of integers from 0 to 2 L − 1 {\displaystyle 2^{L}-1} corresponds to the set of binary strings of length L {\displaystyle L} . For any non-negative integer y {\displaystyle y} , define b i t ( y , k ) {\displaystyle \mathrm {bit} (y,k)} to be the k {\displaystyle k} -th bit in the binary representation of y {\displaystyle y} , such that: y = ∑ k ≥ 0 b i t ( y , k ) 2 k . {\displaystyle y=\sum _{k\geq 0}\mathrm {bit} (y,k)2^{k}.} We then define a function ρ ( y ) {\displaystyle \rho (y)} that outputs the position of the least-significant set bit in the binary representation of y {\displaystyle y} , and L {\displaystyle L} if no such set bit can be found as all bits are zero: ρ ( y ) = { min { k ≥ 0 ∣ b i t ( y , k ) ≠ 0 } y > 0 L y = 0 {\displaystyle \rho (y)={\begin{cases}\min\{k\geq 0\mid \mathrm {bit} (y,k)\neq 0\}&y>0\\L&y=0\end{cases}}} Note that with the above definition we are using 0-indexing for the positions, starting from the least significant bit. For example, ρ ( 13 ) = ρ ( 1101 2 ) = 0 {\displaystyle \rho (13)=\rho (1101_{2})=0} , since the least significant bit is a 1 (0th position), and ρ ( 8 ) = ρ ( 1000 2 ) = 3 {\displaystyle \rho (8)=\rho (1000_{2})=3} , since the least significant set bit is at the 3rd position. At this point, note that under the assumption that the output of our hash function is uniformly distributed, then the probability of observing a hash output ending with 2 k {\displaystyle 2^{k}} (a one, followed by k {\displaystyle k} zeroes) is 2 − ( k + 1 ) {\displaystyle 2^{-(k+1)}} , since this corresponds to flipping k {\displaystyle k} heads and then a tail with a fair coin. Now the Flajolet–Martin algorithm for estimating the cardinality of a multiset M {\displaystyle M} is as follows: Initialize a bit-vector BITMAP to be of length L {\displaystyle L} and contain all 0s. For each element x {\displaystyle x} in M {\displaystyle M} : Calculate the index i = ρ ( h a s h ( x ) ) {\displaystyle i=\rho (\mathrm {hash} (x))} . Set B I T M A P [ i ] = 1 {\displaystyle \mathrm {BITMAP} [i]=1} . Let R {\displaystyle R} denote the smallest index i {\displaystyle i} such that B I T M A P [ i ] = 0 {\displaystyle \mathrm {BITMAP} [i]=0} . Estimate the cardinality of M {\displaystyle M} as 2 R / ϕ {\displaystyle 2^{R}/\phi } , where ϕ ≈ 0.77351 {\displaystyle \phi \approx 0.77351} . The idea is that if n {\displaystyle n} is the number of distinct elements in the multiset M {\displaystyle M} , then B I T M A P [ 0 ] {\displaystyle \mathrm {BITMAP} [0]} is accessed approximately n / 2 {\displaystyle n/2} times, B I T M A P [ 1 ] {\displaystyle \mathrm {BITMAP} [1]} is accessed approximately n / 4 {\displaystyle n/4} times and so on. Consequently, if i ≫ log 2 n {\displaystyle i\gg \log _{2}n} , then B I T M A P [ i ] {\displaystyle \mathrm {BITMAP} [i]} is almost certainly 0, and if i ≪ log 2 n {\displaystyle i\ll \log _{2}n} , then B I T M A P [ i ] {\displaystyle \mathrm {BITMAP} [i]} is almost certainly 1. If i ≈ log 2 n {\displaystyle i\approx \log _{2}n} , then B I T M A P [ i ] {\displaystyle \mathrm {BITMAP} [i]} can be expected to be either 1 or 0. The correction factor ϕ ≈ 0.77351 {\displaystyle \phi \approx 0.77351} (OEIS: A244256) is found by calculations, which can be found in the original article. == Improving accuracy == A problem with the Flajolet–Martin algorithm in the above form is that the results vary significantly. A common solution has been to run the algorithm multiple times with k {\displaystyle k} different hash functions and combine the results from the different runs. One idea is to take the mean of the k {\displaystyle k} results together from each hash function, obtaining a single estimate of the cardinality. The problem with this is that averaging is very susceptible to outliers (which are likely here). A different idea is to use the median, which is less prone to be influences by outliers. The problem with this is that the results can only take form 2 R / ϕ {\displaystyle 2^{R}/\phi } , where R {\displaystyle R} is integer. A common solution is to combine both the mean and the median: Create k ⋅ l {\displaystyle k\cdot l} hash functions and split them into k {\displaystyle k} distinct groups (each of size l {\displaystyle l} ). Within each group use the mean for aggregating together the l {\displaystyle l} results, and finally take the median of the k {\displaystyle k} group estimates as the final estimate. The 2007 HyperLogLog algorithm splits the multiset into subsets and estimates their cardinalities, then it uses the harmonic mean to combine them into an estimate for the original cardinality.
ActivTrak
ActivTrak is an American company that produces workforce analytics and productivity software. The company was founded in 2009 by Birch Grove Software and is headquartered in Austin, Texas. The company has raised US$77.5 million in funding and is backed by Sapphire Ventures and Elsewhere Partners. == History == ActivTrak was founded in 2009 by Herb Axilrod and Anton Seidler in Dallas, Texas. ActivTrak's first on-demand software product launched in 2012, and the workforce analytics platform launched in 2015. It uses data sourced from more than 9,500 customers and 900,000 users. In 2019, ActivTrak raised $20 million in a Series A round of funding with Elsewhere Partners, a growth-stage venture capital firm that principally invests in B2B startups. Rita Selvaggi assumed the role of CEO. In 2020, ActivTrak raised $50M in a Series B round of funding with Sapphire Ventures and Elsewhere Partners. The company also introduced the ActivTrak Productivity Lab, an online resource about workforce productivity research, industry benchmark data, and best practices. == Product == ActivTrak is a workforce analytics and productivity platform that uses reports, dashboards, and data analysis. The platform uses machine learning (AI) to collect and analyze user activity data and produce reports about workforce productivity. The software runs on Microsoft Windows, Mac, Chrome, Terminal Services, and VDI. It includes the ActivTrak Agent, which runs in the background and collects data. It responds to user activity, sensing mouse and keyboard movement in the active window(s) of the user's device. This data is collected and stored in a database that aggregates the data based on the user's request. ActivTrak does not utilize keystroke logging, content scraping, camera access, video recording or mobile device monitoring. The database leverages data analytics to generate account and team benchmarks, and identify productivity patterns and outliers. == Awards == Built In, 100 Best Midsize Places to Work in Austin, 2025 G2, Winter: Best Estimated ROI, High Performer, Best Relationship, Best Support, Users Most Likely to Recommend, Easiest Setup, Easiest Admin, Best Meets Requirements, Users Love Us, 2025 TrustRadius, Buyer’s Choice, 2025 Deloitte Technology Fast 500, No. 468 Fastest-Growing Company, 2024 Product Marketing Alliance, AI Marketing Innovation, 2024 Fortune Best Workplaces in Technology™, 2024 Inc. 5000, No. 2335 of America’s Fastest-Growing Private Companies, 2024 Fortune Best Workplaces in Texas™, 2024 Reworked IMPACT Gold Award: Most Innovative Workplace Productivity Solution, 2024 TrustRadius, Most Loved, 2024 Great Place To Work-Certified™, 2024 Inc. 5000 Regionals: Southwest, 2024 Brandon Hall Group, Best Advance in HR Predictive Analytics Technology, 2024
Information seeking
Information seeking is the process or activity of attempting to obtain information in both human and technological contexts. Information seeking is related to, but different from, information retrieval (IR). == Compared to information retrieval == Traditionally, IR tools have been designed for IR professionals to enable them to effectively and efficiently retrieve information from a source. It is assumed that the information exists in the source and that a well-formed query will retrieve it (and nothing else). It has been argued that laypersons' information seeking on the internet is very different from information retrieval as performed within the IR discourse. Yet, internet search engines are built on IR principles. Since the late 1990s a body of research on how casual users interact with internet search engines has been forming, but the topic is far from fully understood. IR can be said to be technology-oriented, focusing on algorithms and issues such as precision and recall. Information seeking may be understood as a more human-oriented and open-ended process than information retrieval. In information seeking, one does not know whether there exists an answer to one's query, so the process of seeking may provide the learning required to satisfy one's information need. == In different contexts == Much library and information science (LIS) research has focused on the information-seeking practices of practitioners within various fields of professional work. Studies have been carried out into the information-seeking behaviors of librarians, academics, medical professionals, engineers, lawyers and mini-publics(among others). Much of this research has drawn on the work done by Leckie, Pettigrew (now Fisher) and Sylvain, who in 1996 conducted an extensive review of the LIS literature (as well as the literature of other academic fields) on professionals' information seeking. The authors proposed an analytic model of professionals' information seeking behaviour, intended to be generalizable across the professions, thus providing a platform for future research in the area. The model was intended to "prompt new insights... and give rise to more refined and applicable theories of information seeking" (1996, p. 188). The model has been adapted by Wilkinson (2001) who proposes a model of the information seeking of lawyers. Recent studies in this topic address the concept of information-gathering that "provides a broader perspective that adheres better to professionals' work-related reality and desired skills." (Solomon & Bronstein, 2021). == Theories of information-seeking behavior == A variety of theories of information behavior – e.g. Zipf's Principle of Least Effort, Brenda Dervin's Sense Making, Elfreda Chatman's Life in the Round – seek to understand the processes that surround information seeking. In addition, many theories from other disciplines have been applied in investigating an aspect or whole process of information seeking behavior. A review of the literature on information seeking behavior shows that information seeking has generally been accepted as dynamic and non-linear (Foster, 2005; Kuhlthau 2006). People experience the information search process as an interplay of thoughts, feelings and actions (Kuhlthau, 2006). Donald O. Case (2007) also wrote a good book that is a review of the literature. Information seeking has been found to be linked to a variety of interpersonal communication behaviors beyond question-asking, to include strategies such as candidate answers. Robinson's (2010) research suggests that when seeking information at work, people rely on both other people and information repositories (e.g., documents and databases), and spend similar amounts of time consulting each (7.8% and 6.4% of work time, respectively; 14.2% in total). However, the distribution of time among the constituent information seeking stages differs depending on the source. When consulting other people, people spend less time locating the information source and information within that source, similar time understanding the information, and more time problem solving and decision making, than when consulting information repositories. Furthermore, the research found that people spend substantially more time receiving information passively (i.e., information that they have not requested) than actively (i.e., information that they have requested), and this pattern is also reflected when they provide others with information. == Wilson's nested model of conceptual areas == The concepts of information seeking, information retrieval, and information behaviour are objects of investigation of information science. Within this scientific discipline a variety of studies has been undertaken analyzing the interaction of an individual with information sources in case of a specific information need, task, and context. The research models developed in these studies vary in their level of scope. Wilson (1999) therefore developed a nested model of conceptual areas, which visualizes the interrelation of the here mentioned central concepts. Wilson defines models of information behavior to be "statements, often in the form of diagrams, that attempt to describe an information-seeking activity, the causes and consequences of that activity, or the relationships among stages in information-seeking behaviour" (1999: 250).