Cultural technology

Cultural technology

Cultural technology (Korean: 문화기술; Hanja: 文化技術; RR: munhwagisul) is a system used by South Korean talent agencies to promote K-pop culture throughout the world as part of the Korean Wave. The system was developed by Lee Soo-man, founder of talent agency and record company SM Entertainment. == History == === Coinage === During a speech at the Stanford Graduate School of Business in 2011, Lee said he coined the term "cultural technology" as a system about fourteen years prior, when S.M. Entertainment decided to promote its K-pop artists to all of Asia. In the late 1990s, Lee and his colleagues created a manual on cultural technology, which specified the steps needed to popularize K-pop artists outside South Korea. "The manual, which all S.M. employees are instructed to learn, explains when to bring in foreign composers, producers, and choreographers; what chord progressions to use in what country; the precise color of eyeshadow a performer should wear in a particular country; the exact hand gestures he or she should make; and the camera angles to be used in the videos (a three-hundred-and-sixty-degree group shot to open the video, followed by a montage of individual closeups)," according to The New Yorker. The term "cultural technology," apart from Lee's systemized definition, can be traced back to the lectures of Michael White, an Australian social worker, educator, and therapeutic theorist and his works Narrative Means to Therapeutic Ends (1990) and Maps of Narrative Practice (2007). Its usage may also date further back to French philosopher Michel Foucault (1977). South Korean computer scientist Kwangyun Wohn said he coined the term "culture technology" in 1994. Cultural technology has also been one of six technology initiatives of the South Korean government since 2001. In regards to cultural technology, the Korean Wave is considered one of the most successful outcomes of government support of exporting Korean entertainment products. === The Four Core Stages === The cultural technology system originally employed by SM Entertainment since the 1990s existed in four stages: Casting, Training, Producing, and Marketing/Managing. Each of these four stages were curated to help spread the Hallyu wave through the development of its artists, and are present in the strategies of many other South Korean talent agencies when creating, debuting, and marketing groups. ==== Casting ==== While the majority of K-pop idols are from South Korea, some are from Japan, China, or Thailand. Many of Korea's entertainment companies, such as SM's Global Auditions, Bighit's Hit It auditions, and YG's Next Generation, host worldwide auditions. Scouting and streetcasting are also common, with members like BTS's Jin recruited for their looks or other surface reasons. Sometimes, casting agents go to dance schools to recruit the top dancers to be trained further at the entertainment company. ==== Training ==== Idols train extensively before debut. They receive training in dance, vocal activities, presentation, and other areas that will benefit them in the industry. Oftentimes, this training will last for years at a time, and trainees are in the proverbial dungeon. Before debut, idols and groups attempt to gain fans through pre-debut activities. SM Entertainment has a system in place called SM Rookies, which is a pre-debut team that hosts concerts and releases videos that strengthen the fanbase of the group even before their first single is released. Other forms of pre-debut activities include featuring in other, more seasoned idols' videos—like Nu'est in Orange Caramel or Exo in Girls' Generation-TTS Twinkle or BTS in Jo Kwon. One particular method of pre-debut training is coupled with casting in production shows, like Sixteen and Produce 101, in which members for a final group are selected and trained. ==== Producing ==== The production of music is integral in culture technology. For cultural technology, production of music helps create differentiated content to set trends in the K-pop world—trends that vary from music to also costume, choreography, and music videos. SM in particular focuses heavily on the expansion globally. Some companies also outsource production to more internationally famed parties, like Cube Entertainment's partnership with Skrillex for 4minute's Act. 7. ==== Marketing/Managing ==== In the marketing and management stage, talent agencies seek to broaden their reach. Often, idols have potential for being actors and actresses in dramas, or perhaps hosts/permanent members of variety shows like Kim Hee-chul in Knowing Bros. This so-called omnidirectional marketing lineup ranges over lifestyle and seeks to reach many aspects of living, like music, TV, drama, entertainment, sports, and fashion. This is also where older groups find new life, like Super Junior. Companies are not complacent but experiment constantly to develop the best marketing for the best management system. Marketing also aspires to branch out to international audiences, sometimes via the implementation of variety shows. Despite being primarily in Korean, these variety shows are accessible to all due to the simplistic, easily understood nature of shows—game-oriented shows like Run BTS! or consistently subbed shows like Weekly Idol are popular in showing the fun-loving side of idols. == Evolution into New Culture Technology == In February 2016, SM hosted a press conference discussing the future of SM and its cultural technology. Lee Soo-man announced the implementation of New Culture Technology, an SM-specific system. While SM's cultural technology in the past relied on local, Korean artists like Rain and BoA, the updated model tries to embed more and more foreign singers from strategic markets into larger girl or boy bands. These imported singers are then used to promote their acts back in their respective home countries. New Culture Technology is five projects—SM Station, EDM, Digital Platforms, Rookies Entertainment, and MCN—and one experimental group, NCT. It is a convergence and expansion of SM's four core culture technologies developed and deals heavily with interaction and the desire to innovate through communication. === SM Station === SM announced their intention of creating a new song every week for 52 weeks. Through this constant output of music, they intend to stray away from conventional forms of music and show active movement in digital music market and physical album market through freely and continuously releasing music. Additionally, this SM Station will feature collaborations between artists, producers, composers, and company brands outside the SM label. The name of SM Station is both derived from the radio station and the metaphorical train station. === NCT === Neo Culture Technology (NCT) introduced the idea of "Interactive". SM company tried to connect the targeting market, customers and artist, in order to lead the K-pop culture. NCT (Neo Culture Technology) is the new artist group formed by SM that embodies the concepts of cultural technology. With the seemingly limitless combinations and groups, SM aspires to make the whole world a stage for NCT. Since 2023, there are six NCT groups, who debuted on the digital song sales: NCT U, NCT 127, NCT Dream, WayV, NCT DoJaeJung, and NCT Wish. As of October 2023, the group consists of 25 members: Johnny, Taeyong, Yuta, Kun, Doyoung, Ten, Jaehyun, Winwin, Jungwoo, Mark, Xiaojun, Hendery, Renjun, Jeno, Haechan, Jaemin, Yangyang, Chenle, Jisung, Sion, Riku, Yushi, Daeyoung, Ryo, and Sakuya. ScreaM Records ScreaM Records has been released by SM Entertainment as an EDM label since 2016 for "SM TOWN: New Culture Technology". ScreaM Records is made for "performances made to be enjoyed". It collaborates with inside and outside Korean well-known EDM DJs. ScreaM Records has first launched collaborated song "Wave" E-Mart's home electronics store, Electro Mart. "Our goal is to provide opportunities to producers who have yet to be discovered and produce world famous DJs from the Asian scene." a ScreaM Records representative said. == Three stages of globalization == According to Lee, there are three stages necessary to popularize Korean culture outside South Korea: exporting the product, collaborating with international companies to expand the product's presence abroad, and finally creating a joint venture with international companies. As part of their joint ventures with international companies, South Korean talent agencies may hire foreign composers, producers, and choreographers to ensure K-pop songs feel "local" to foreign countries.

DAVI

The Dutch Automated Vehicle Initiative (DAVI) is a research and demonstration initiative developing automated vehicles for use on public roads. The project is unique in that, besides simply making driverless cars, it also focuses on having automated vehicles share information among each other. The aim is to have the cars help to avoid traffic congestion by reducing the safety distance between the cars (from 2 seconds to 0.5 seconds) and avoiding sudden traffic slow-downs due to maneuvers undertaken by drivers.

Stochastic variance reduction

(Stochastic) variance reduction is an algorithmic approach to minimizing functions that can be decomposed into finite sums. By exploiting the finite sum structure, variance reduction techniques are able to achieve convergence rates that are impossible to achieve with methods that treat the objective as an infinite sum, as in the classical Stochastic approximation setting. Variance reduction approaches are widely used for training machine learning models such as logistic regression and support vector machines as these problems have finite-sum structure and uniform conditioning that make them ideal candidates for variance reduction. == Finite sum objectives == A function f {\displaystyle f} is considered to have finite sum structure if it can be decomposed into a summation or average: f ( x ) = 1 n ∑ i = 1 n f i ( x ) , {\displaystyle f(x)={\frac {1}{n}}\sum _{i=1}^{n}f_{i}(x),} where the function value and derivative of each f i {\displaystyle f_{i}} can be queried independently. Although variance reduction methods can be applied for any positive n {\displaystyle n} and any f i {\displaystyle f_{i}} structure, their favorable theoretical and practical properties arise when n {\displaystyle n} is large compared to the condition number of each f i {\displaystyle f_{i}} , and when the f i {\displaystyle f_{i}} have similar (but not necessarily identical) Lipschitz smoothness and strong convexity constants. The finite sum structure should be contrasted with the stochastic approximation setting which deals with functions of the form f ( θ ) = E ξ ⁡ [ F ( θ , ξ ) ] {\textstyle f(\theta )=\operatorname {E} _{\xi }[F(\theta ,\xi )]} which is the expected value of a function depending on a random variable ξ {\textstyle \xi } . Any finite sum problem can be optimized using a stochastic approximation algorithm by using F ( ⋅ , ξ ) = f ξ {\displaystyle F(\cdot ,\xi )=f_{\xi }} . == Rapid Convergence == Stochastic variance reduced methods without acceleration are able to find a minima of f {\displaystyle f} within accuracy ϵ > {\displaystyle \epsilon >} , i.e. f ( x ) − f ( x ∗ ) ≤ ϵ {\displaystyle f(x)-f(x_{})\leq \epsilon } in a number of steps of the order: O ( ( L μ + n ) log ⁡ ( 1 ϵ ) ) . {\displaystyle O\left(\left({\frac {L}{\mu }}+n\right)\log \left({\frac {1}{\epsilon }}\right)\right).} The number of steps depends only logarithmically on the level of accuracy required, in contrast to the stochastic approximation framework, where the number of steps O ( L / ( μ ϵ ) ) {\displaystyle O{\bigl (}L/(\mu \epsilon ){\bigr )}} required grows proportionally to the accuracy required. Stochastic variance reduction methods converge almost as fast as the gradient descent method's O ( ( L / μ ) log ⁡ ( 1 / ϵ ) ) {\displaystyle O{\bigl (}(L/\mu )\log(1/\epsilon ){\bigr )}} rate, despite using only a stochastic gradient, at a 1 / n {\displaystyle 1/n} lower cost than gradient descent. Accelerated methods in the stochastic variance reduction framework achieve even faster convergence rates, requiring only O ( ( n L μ + n ) log ⁡ ( 1 ϵ ) ) {\displaystyle O\left(\left({\sqrt {\frac {nL}{\mu }}}+n\right)\log \left({\frac {1}{\epsilon }}\right)\right)} steps to reach ϵ {\displaystyle \epsilon } accuracy, potentially n {\displaystyle {\sqrt {n}}} faster than non-accelerated methods. Lower complexity bounds. for the finite sum class establish that this rate is the fastest possible for smooth strongly convex problems. == Approaches == Variance reduction approaches fall within four main categories: table averaging methods, full-gradient snapshot methods, recursive estimator methods (e.g., SARAH), and dual methods. Each category contains methods designed for dealing with convex, non-smooth, and non-convex problems, each differing in hyper-parameter settings and other algorithmic details. === SAGA === In the SAGA method, the prototypical table averaging approach, a table of size n {\displaystyle n} is maintained that contains the last gradient witnessed for each f i {\displaystyle f_{i}} term, which we denote g i {\displaystyle g_{i}} . At each step, an index i {\displaystyle i} is sampled, and a new gradient ∇ f i ( x k ) {\displaystyle \nabla f_{i}(x_{k})} is computed. The iterate x k {\displaystyle x_{k}} is updated with: x k + 1 = x k − γ [ ∇ f i ( x k ) − g i + 1 n ∑ i = 1 n g i ] , {\displaystyle x_{k+1}=x_{k}-\gamma \left[\nabla f_{i}(x_{k})-g_{i}+{\frac {1}{n}}\sum _{i=1}^{n}g_{i}\right],} and afterwards table entry i {\displaystyle i} is updated with g i = ∇ f i ( x k ) {\displaystyle g_{i}=\nabla f_{i}(x_{k})} . SAGA is among the most popular of the variance reduction methods due to its simplicity, easily adaptable theory, and excellent performance. It is the successor of the SAG method, improving on its flexibility and performance. === SVRG === The stochastic variance reduced gradient method (SVRG), the prototypical snapshot method, uses a similar update except instead of using the average of a table it instead uses a full-gradient that is reevaluated at a snapshot point x ~ {\displaystyle {\tilde {x}}} at regular intervals of m ≥ n {\displaystyle m\geq n} iterations. The update becomes: x k + 1 = x k − γ [ ∇ f i ( x k ) − ∇ f i ( x ~ ) + ∇ f ( x ~ ) ] , {\displaystyle x_{k+1}=x_{k}-\gamma [\nabla f_{i}(x_{k})-\nabla f_{i}({\tilde {x}})+\nabla f({\tilde {x}})],} This approach requires two stochastic gradient evaluations per step, one to compute ∇ f i ( x k ) {\displaystyle \nabla f_{i}(x_{k})} and one to compute ∇ f i ( x ~ ) , {\displaystyle \nabla f_{i}({\tilde {x}}),} where-as table averaging approaches need only one. Despite the high computational cost, SVRG is popular as its simple convergence theory is highly adaptable to new optimization settings. It also has lower storage requirements than tabular averaging approaches, which make it applicable in many settings where tabular methods can not be used. === SARAH === The SARAH (stochastic recursive gradient) method maintains a recursive estimator of the gradient rather than storing a table of past gradients (as in SAGA) or computing periodic full-gradient snapshots (as in SVRG). At the start of an inner loop, a full gradient is computed at a reference point x ~ {\displaystyle {\tilde {x}}} : v 0 = ∇ f ( x ~ ) {\displaystyle v_{0}=\nabla f({\tilde {x}})} . For inner iterations, with a sampled index i k {\displaystyle i_{k}} , the gradient estimator and iterate are updated by: v k = ∇ f i k ( x k ) − ∇ f i k ( x k − 1 ) + v k − 1 , x k + 1 = x k − γ v k . {\displaystyle v_{k}=\nabla f_{i_{k}}(x_{k})-\nabla f_{i_{k}}(x_{k-1})+v_{k-1},\qquad x_{k+1}=x_{k}-\gamma v_{k}.} This recursion requires two component-gradient evaluations per step ∇ f i k ( x k ) {\displaystyle \nabla f_{i_{k}}(x_{k})} and ∇ f i k ( x k − 1 ) {\displaystyle \nabla f_{i_{k}}(x_{k-1})} but does not need to store per-sample gradients, resulting in lower memory cost than table-averaging methods. SARAH admits linear convergence for strongly convex functions and has been extended to more general nonconvex and composite problems. === SDCA === Exploiting the dual representation of the objective leads to another variance reduction approach that is particularly suited to finite-sums where each term has a structure that makes computing the convex conjugate f i ∗ , {\displaystyle f_{i}^{},} or its proximal operator tractable. The standard SDCA method considers finite sums that have additional structure compared to generic finite sum setting: f ( x ) = 1 n ∑ i = 1 n f i ( x T v i ) + λ 2 ‖ x ‖ 2 , {\displaystyle f(x)={\frac {1}{n}}\sum _{i=1}^{n}f_{i}(x^{T}v_{i})+{\frac {\lambda }{2}}\|x\|^{2},} where each f i {\displaystyle f_{i}} is 1 dimensional and each v i {\displaystyle v_{i}} is a data point associated with f i {\displaystyle f_{i}} . SDCA solves the dual problem: max α ∈ R n − 1 n ∑ i = 1 n f i ∗ ( − α i ) − λ 2 ‖ 1 λ n ∑ i = 1 n α i v i ‖ 2 , {\displaystyle \max _{\alpha \in \mathbb {R} ^{n}}-{\frac {1}{n}}\sum _{i=1}^{n}f_{i}^{}(-\alpha _{i})-{\frac {\lambda }{2}}\left\|{\frac {1}{\lambda n}}\sum _{i=1}^{n}\alpha _{i}v_{i}\right\|^{2},} by a stochastic coordinate ascent procedure, where at each step the objective is optimized with respect to a randomly chosen coordinate α i {\displaystyle \alpha _{i}} , leaving all other coordinates the same. An approximate primal solution x {\displaystyle x} can be recovered from the α {\displaystyle \alpha } values: x = 1 λ n ∑ i = 1 n α i v i {\displaystyle x={\frac {1}{\lambda n}}\sum _{i=1}^{n}\alpha _{i}v_{i}} . This method obtains similar theoretical rates of convergence to other stochastic variance reduced methods, while avoiding the need to specify a step-size parameter. It is fast in practice when λ {\displaystyle \lambda } is large, but significantly slower than the other approaches when λ {\displaystyle \lambda } is small. == Accelerated approaches == Accelerated variance reduction methods are built upon the standard methods above. The earliest approaches make use of proximal operators t

SqueezeNet

SqueezeNet is a deep neural network for image classification released in 2016. SqueezeNet was developed by researchers at DeepScale, University of California, Berkeley, and Stanford University. In designing SqueezeNet, the authors' goal was to create a smaller neural network with fewer parameters while achieving competitive accuracy. Their best-performing model achieved the same accuracy as AlexNet on ImageNet classification, but has a size 510x less than it. == Version history == SqueezeNet was originally released on February 22, 2016. This original version of SqueezeNet was implemented on top of the Caffe deep learning software framework. Shortly thereafter, the open-source research community ported SqueezeNet to a number of other deep learning frameworks. On February 26, 2016, Eddie Bell released a port of SqueezeNet for the Chainer deep learning framework. On March 2, 2016, Guo Haria released a port of SqueezeNet for the Apache MXNet framework. On June 3, 2016, Tammy Yang released a port of SqueezeNet for the Keras framework. In 2017, companies including Baidu, Xilinx, Imagination Technologies, and Synopsys demonstrated SqueezeNet running on low-power processing platforms such as smartphones, FPGAs, and custom processors. As of 2018, SqueezeNet ships "natively" as part of the source code of a number of deep learning frameworks such as PyTorch, Apache MXNet, and Apple CoreML. In addition, third party developers have created implementations of SqueezeNet that are compatible with frameworks such as TensorFlow. Below is a summary of frameworks that support SqueezeNet. == Relationship to other networks == === AlexNet === SqueezeNet was originally described in SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size. AlexNet is a deep neural network that has 240 MB of parameters, and SqueezeNet has just 5 MB of parameters. This small model size can more easily fit into computer memory and can more easily be transmitted over a computer network. However, it's important to note that SqueezeNet is not a "squeezed version of AlexNet." Rather, SqueezeNet is an entirely different DNN architecture than AlexNet. What SqueezeNet and AlexNet have in common is that both of them achieve approximately the same level of accuracy when evaluated on the ImageNet image classification validation dataset. === Model compression === Model compression (e.g. quantization and pruning of model parameters) can be applied to a deep neural network after it has been trained. In the SqueezeNet paper, the authors demonstrated that a model compression technique called Deep Compression can be applied to SqueezeNet to further reduce the size of the parameter file from 5 MB to 500 KB. Deep Compression has also been applied to other DNNs, such as AlexNet and VGG. == Variants == Some of the members of the original SqueezeNet team have continued to develop resource-efficient deep neural networks for a variety of applications. A few of these works are noted in the following table. As with the original SqueezeNet model, the open-source research community has ported and adapted these newer "squeeze"-family models for compatibility with multiple deep learning frameworks. In addition, the open-source research community has extended SqueezeNet to other applications, including semantic segmentation of images and style transfer.

Linear classifier

In machine learning, a linear classifier makes a classification decision for each object based on a linear combination of its features. A simpler definition is to say that a linear classifier is one whose decision boundaries are linear. Such classifiers work well for practical problems such as document classification, and more generally for problems with many variables (features), reaching accuracy levels comparable to non-linear classifiers while taking less time to train and use. == Definition == If the input feature vector to the classifier is a real vector x → {\displaystyle {\vec {x}}} , then the output score is y = f ( w → ⋅ x → ) = f ( ∑ j w j x j ) , {\displaystyle y=f({\vec {w}}\cdot {\vec {x}})=f\left(\sum _{j}w_{j}x_{j}\right),} where w → {\displaystyle {\vec {w}}} is a real vector of weights and f is a function that converts the dot product of the two vectors into the desired output. (In other words, w → {\displaystyle {\vec {w}}} is a one-form or linear functional mapping x → {\displaystyle {\vec {x}}} onto R.) The weight vector w → {\displaystyle {\vec {w}}} is learned from a set of labeled training samples. Often f is a threshold function, which maps all values of w → ⋅ x → {\displaystyle {\vec {w}}\cdot {\vec {x}}} above a certain threshold to the first class and all other values to the second class; e.g., f ( x ) = { 1 if w T ⋅ x > θ , 0 otherwise {\displaystyle f(\mathbf {x} )={\begin{cases}1&{\text{if }}\ \mathbf {w} ^{T}\cdot \mathbf {x} >\theta ,\\0&{\text{otherwise}}\end{cases}}} The superscript T indicates the transpose and θ {\displaystyle \theta } is a scalar threshold. A more complex f might give the probability that an item belongs to a certain class. For a two-class classification problem, one can visualize the operation of a linear classifier as splitting a high-dimensional input space with a hyperplane: all points on one side of the hyperplane are classified as "yes", while the others are classified as "no". A linear classifier is often used in situations where the speed of classification is an issue, since it is often the fastest classifier, especially when x → {\displaystyle {\vec {x}}} is sparse. Also, linear classifiers often work very well when the number of dimensions in x → {\displaystyle {\vec {x}}} is large, as in document classification, where each element in x → {\displaystyle {\vec {x}}} is typically the number of occurrences of a word in a document (see document-term matrix). In such cases, the classifier should be well-regularized. == Generative models vs. discriminative models == There are two broad classes of methods for determining the parameters of a linear classifier w → {\displaystyle {\vec {w}}} . They can be generative and discriminative models. Methods of the former model joint probability distribution, whereas methods of the latter model conditional density functions P ( c l a s s | x → ) {\displaystyle P({\rm {class}}|{\vec {x}})} . Examples of such algorithms include: Linear Discriminant Analysis (LDA)—assumes Gaussian conditional density models Naive Bayes classifier with multinomial or multivariate Bernoulli event models. The second set of methods includes discriminative models, which attempt to maximize the quality of the output on a training set. Additional terms in the training cost function can easily perform regularization of the final model. Examples of discriminative training of linear classifiers include: Logistic regression—maximum likelihood estimation of w → {\displaystyle {\vec {w}}} assuming that the observed training set was generated by a binomial model that depends on the output of the classifier. Perceptron—an algorithm that attempts to fix all errors encountered in the training set Fisher's Linear Discriminant Analysis—an algorithm (different than "LDA") that maximizes the ratio of between-class scatter to within-class scatter, without any other assumptions. It is in essence a method of dimensionality reduction for binary classification. Support vector machine—an algorithm that maximizes the margin between the decision hyperplane and the examples in the training set. Note: Despite its name, LDA does not belong to the class of discriminative models in this taxonomy. However, its name makes sense when we compare LDA to the other main linear dimensionality reduction algorithm: principal components analysis (PCA). LDA is a supervised learning algorithm that utilizes the labels of the data, while PCA is an unsupervised learning algorithm that ignores the labels. To summarize, the name is a historical artifact. Discriminative training often yields higher accuracy than modeling the conditional density functions. However, handling missing data is often easier with conditional density models. All of the linear classifier algorithms listed above can be converted into non-linear algorithms operating on a different input space φ ( x → ) {\displaystyle \varphi ({\vec {x}})} , using the kernel trick. === Discriminative training === Discriminative training of linear classifiers usually proceeds in a supervised way, by means of an optimization algorithm that is given a training set with desired outputs and a loss function that measures the discrepancy between the classifier's outputs and the desired outputs. Thus, the learning algorithm solves an optimization problem of the form arg ⁡ min w R ( w ) + C ∑ i = 1 N L ( y i , w T x i ) {\displaystyle {\underset {\mathbf {w} }{\arg \min }}\;R(\mathbf {w} )+C\sum _{i=1}^{N}L(y_{i},\mathbf {w} ^{\mathsf {T}}\mathbf {x} _{i})} where w is a vector of classifier parameters, L(yi, wTxi) is a loss function that measures the discrepancy between the classifier's prediction and the true output yi for the i'th training example, R(w) is a regularization function that prevents the parameters from getting too large (causing overfitting), and C is a scalar constant (set by the user of the learning algorithm) that controls the balance between the regularization and the loss function. Popular loss functions include the hinge loss (for linear SVMs) and the log loss (for linear logistic regression). If the regularization function R is convex, then the above is a convex problem. Many algorithms exist for solving such problems; popular ones for linear classification include (stochastic) gradient descent, L-BFGS, coordinate descent and Newton methods.

Social software engineering

Social software engineering (SSE) is a branch of software engineering that is concerned with the social aspects of software development and the developed software. SSE focuses on the socialness of both software engineering and developed software. On the one hand, the consideration of social factors in software engineering activities, processes and CASE tools is deemed to be useful to improve the quality of both development process and produced software. Examples include the role of situational awareness and multi-cultural factors in collaborative software development. On the other hand, the dynamicity of the social contexts in which software could operate (e.g., in a cloud environment) calls for engineering social adaptability as a runtime iterative activity. Examples include approaches which enable software to gather users' quality feedback and use it to adapt autonomously or semi-autonomously. SSE studies and builds socially-oriented tools to support collaboration and knowledge sharing in software engineering. SSE also investigates the adaptability of software to the dynamic social contexts in which it could operate and the involvement of clients and end-users in shaping software adaptation decisions at runtime. Social context includes norms, culture, roles and responsibilities, stakeholder's goals and interdependencies, end-users perception of the quality and appropriateness of each software behaviour, etc. The participants of the 1st International Workshop on Social Software Engineering and Applications (SoSEA 2008) proposed the following characterization: Community-centered: Software is produced and consumed by and/or for a community rather than focusing on individuals Collaboration/collectiveness: Exploiting the collaborative and collective capacity of human beings Companionship/relationship: Making explicit the various associations among people Human/social activities: Software is designed consciously to support human activities and to address social problems Social inclusion: Software should enable social inclusion enforcing links and trust in communities Thus, SSE can be defined as "the application of processes, methods, and tools to enable community-driven creation, management, deployment, and use of software in online environments". One of the main observations in the field of SSE is that the concepts, principles, and technologies made for social software applications are applicable to software development itself as software engineering is inherently a social activity. SSE is not limited to specific activities of software development. Accordingly, tools have been proposed supporting different parts of SSE, for instance, social system design or social requirements engineering. Consequently vertical market software, such as software development tools, engineering tools, marketing tools or software that helps users in a decision-making process can profit from social components. Such vertical social software differentiates strongly in its user-base from traditional social software such as Yammer.

Computational learning theory

In computer science, computational learning theory (or just learning theory) is a subfield of artificial intelligence devoted to studying the design and analysis of machine learning algorithms. == Overview == Theoretical results in machine learning often focus on a type of inductive learning known as supervised learning. In supervised learning, an algorithm is provided with labeled samples. For instance, the samples might be descriptions of mushrooms, with labels indicating whether they are edible or not. The algorithm uses these labeled samples to create a classifier. This classifier assigns labels to new samples, including those it has not previously encountered. The goal of the supervised learning algorithm is to optimize performance metrics, such as minimizing errors on new samples. In addition to performance bounds, computational learning theory studies the time complexity and feasibility of learning . In computational learning theory, a computation is considered feasible if it can be done in polynomial time . There are two kinds of time complexity results: Positive results – Showing that a certain class of functions is learnable in polynomial time. Negative results – Showing that certain classes cannot be learned in polynomial time. Negative results often rely on commonly believed, but yet unproven assumptions, such as: Computational complexity – P ≠ NP (the P versus NP problem); Cryptographic – One-way functions exist. There are several different approaches to computational learning theory based on making different assumptions about the inference principles used to generalise from limited data. This includes different definitions of probability (see frequency probability, Bayesian probability) and different assumptions on the generation of samples. The different approaches include: Exact learning, proposed by Dana Angluin; Probably approximately correct learning (PAC learning), proposed by Leslie Valiant; VC theory, proposed by Vladimir Vapnik and Alexey Chervonenkis; Inductive inference as developed by Ray Solomonoff; Algorithmic learning theory, from the work of E. Mark Gold; Online machine learning, from the work of Nick Littlestone. While its primary goal is to understand learning abstractly, computational learning theory has led to the development of practical algorithms. For example, PAC theory inspired boosting, VC theory led to support vector machines, and Bayesian inference led to belief networks.