OpenCodePapers

Constrained Clustering: Only Connect Walls Dataset Task 1 (Grouping)
Dataset Link
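
In Task 1 (Grouping), each wall of 16 clues has to be partitioned into 4 groups of exactly 4 related clues, which is why the leaderboard frames it as constrained clustering. The sketch below is only an illustration of that setup, not the pipeline of any listed paper: it assumes a generic `embed` function (a stand-in for any word- or sentence-embedding model such as GloVe, FastText or E5) and enforces the 4×4 size constraint by running k-means and then solving a balanced assignment with the Hungarian algorithm.

```python
# Minimal sketch (not the official OCW baseline) of size-constrained clustering:
# partition the 16 clue embeddings of one wall into 4 groups of exactly 4 clues.
import numpy as np
from sklearn.cluster import KMeans
from scipy.optimize import linear_sum_assignment

def group_wall(clues, embed, n_groups=4, group_size=4, seed=0):
    X = np.vstack([embed(c) for c in clues])            # (16, d) clue embeddings
    km = KMeans(n_clusters=n_groups, n_init=10, random_state=seed).fit(X)

    # Distance of every clue to every cluster centre; replicate each centre
    # `group_size` times so the assignment puts exactly four clues per group.
    dists = np.linalg.norm(X[:, None, :] - km.cluster_centers_[None, :, :], axis=-1)
    cost = np.repeat(dists, group_size, axis=1)          # (16, 16) cost matrix
    _, cols = linear_sum_assignment(cost)
    labels = cols // group_size                          # slot index -> group id

    groups = [[clues[i] for i in np.where(labels == g)[0]] for g in range(n_groups)]
    return labels, groups
```

Replicating each cluster centre four times turns the size constraint into a square assignment problem, so every clue lands in exactly one of four equally sized groups.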
Results over time (interactive chart of leaderboard metrics by model release date)
Leaderboard
| Paper | Code | Wasserstein Distance (WD) | # Correct Groups | Fowlkes Mallows Score (FMS) | Adjusted Rand Index (ARI) | Adjusted Mutual Information (AMI) | # Solved Walls | Model | Release Date |
|---|---|---|---|---|---|---|---|---|---|
| GPT-4 Technical Report | ✓ Link | 72.9 | 269 | 43.4 | 29.1 | 32.8 | 7 | GPT-4 (5-shot) | 2023-03-15 |
| GPT-4 Technical Report | ✓ Link | 73.4 | 262 | 43.7 | 29.7 | 33.5 | 4 | GPT-4 (1-shot) | 2023-03-15 |
| GPT-4 Technical Report | ✓ Link | 73.6 | 249 | 42.8 | 28.5 | 32.3 | 3 | GPT-4 (10-shot) | 2023-03-15 |
| GPT-4 Technical Report | ✓ Link | 73.7 | 272 | 43.9 | 29.9 | 33.6 | 5 | GPT-4 (3-shot) | 2023-03-15 |
| GPT-4 Technical Report | ✓ Link | 75.8 | 239 | 41.5 | 27.2 | 30.7 | 6 | GPT-4 (0-shot) | 2023-03-15 |
| GPT-4 Technical Report | ✓ Link | 80.6 | 149 | 37.3 | 22.0 | 25.4 | 2 | GPT-3.5-turbo (5-shot) | 2023-03-15 |
| GPT-4 Technical Report | ✓ Link | 80.9 | 140 | 36.8 | 21.3 | 24.7 | 0 | GPT-3.5-turbo (3-shot) | 2023-03-15 |
| GPT-4 Technical Report | ✓ Link | 81.2 | 137 | 36.1 | 20.4 | 24.0 | 2 | GPT-3.5-turbo (10-shot) | 2023-03-15 |
| GPT-4 Technical Report | ✓ Link | 82.3 | 123 | 34.4 | 18.2 | 21.2 | 0 | GPT-3.5-turbo (1-shot) | 2023-03-15 |
| GPT-4 Technical Report | ✓ Link | 82.5 | 114 | 34.0 | 18.4 | 21.6 | 0 | GPT-3.5-turbo (0-shot) | 2023-03-15 |
| Text Embeddings by Weakly-Supervised Contrastive Pre-training | ✓ Link | 83.8 ± .6 | 89 ± 6 | 33.1 ± .3 | 16.3 ± .4 | 19.5 ± .4 | 1 ± 0 | E5 (BASE) | 2022-12-07 |
| Learning Word Vectors for 157 Languages | ✓ Link | 84.2 ± .5 | 80 ± 4 | 32.1 ± .3 | 15.2 ± .3 | 18.4 ± .4 | 0 ± 0 | FastText (Crawl) | 2018-02-19 |
| Text Embeddings by Weakly-Supervised Contrastive Pre-training | ✓ Link | 84.4 ± .7 | 76 ± 5 | 32.3 ± .4 | 15.4 ± .5 | 18.5 ± .6 | 0 ± 0 | E5 (LARGE) | 2022-12-07 |
| GloVe: Global Vectors for Word Representation | ✓ Link | 84.9 ± .4 | 68 ± 4 | 31.5 ± .3 | 14.4 ± .3 | 17.6 ± .4 | 0 ± 0 | GloVe | 2014-10-01 |
| Learning Word Vectors for 157 Languages | ✓ Link | 85.5 ± .5 | 62 ± 3 | 30.4 ± .2 | 13.0 ± .2 | 15.8 ± .3 | 0 ± 0 | FastText (News) | 2018-02-19 |
| MPNet: Masked and Permuted Pre-training for Language Understanding | ✓ Link | 86.3 ± .4 | 50 ± 4 | 29.4 ± .3 | 11.7 ± .4 | 14.3 ± .5 | 0 ± 0 | all-mpnet (BASE) | 2020-04-20 |
| BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | ✓ Link | 88.3 ± .5 | 33 ± 2 | 26.5 ± .2 | 8.2 ± .3 | 10.3 ± .3 | 0 ± 0 | BERT (LARGE) | 2018-10-11 |
| BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | ✓ Link | 89.5 ± .4 | 22 ± 2 | 25.1 ± .2 | 6.4 ± .3 | 8.1 ± .4 | 0 ± 0 | BERT (BASE) | 2018-10-11 |
| Large Language Models are Fixated by Red Herrings: Exploring Creative Problem Solving and Einstellung Effect using the Only Connect Wall Dataset | ✓ Link | | 1405 | | | | 285 | Human Performance | 2023-06-19 |
| Deep contextualized word representations | ✓ Link | 86.3 ± .6 | 55 ± 4 | 29.5 ± .3 | 11.8 ± .4 | 14.5 ± .4 | 0 ± 0 | ELMo (LARGE) | 2018-02-15 |
| DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter | ✓ Link | 86.7 ± .6 | 49 ± 4 | 29.1 ± .2 | 11.3 ± .3 | 14.0 ± .3 | 0 ± 0 | DistilBERT (BASE) | 2019-10-02 |
| RoBERTa: A Robustly Optimized BERT Pretraining Approach | ✓ Link | 88.4 ± .4 | 29 ± 3 | 26.7 ± .2 | 8.4 ± .3 | 9.4 ± .4 | 0 ± 0 | RoBERTa (LARGE) | 2019-07-26 |
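
FMS, ARI and AMI are standard clustering-agreement scores (reported in the table as percentages averaged over walls), so they can be reproduced for a single wall with scikit-learn as sketched below; the Wasserstein Distance (WD) metric is specific to the OCW paper and is not reimplemented here. The grouping counts below assume the usual reading of the leaderboard: a predicted group is correct when its four clues exactly match a gold group, and a wall counts as solved when all four groups are correct.

```python
# Minimal sketch of the overlap metrics for one wall, given gold and predicted
# group labels for the 16 clues (labels are illustrative values, not real data).
from sklearn.metrics import (adjusted_mutual_info_score,
                             adjusted_rand_score,
                             fowlkes_mallows_score)

gold = [0]*4 + [1]*4 + [2]*4 + [3]*4                       # gold group of each clue
pred = [0, 0, 0, 1, 1, 1, 1, 0, 2, 2, 2, 2, 3, 3, 3, 3]    # model's grouping

print("FMS:", fowlkes_mallows_score(gold, pred))
print("ARI:", adjusted_rand_score(gold, pred))
print("AMI:", adjusted_mutual_info_score(gold, pred))

# Count predicted groups whose four clues exactly match a gold group;
# the wall is solved only when all four groups are correct.
gold_sets = [frozenset(range(4*g, 4*g + 4)) for g in range(4)]
pred_sets = [frozenset(i for i, p in enumerate(pred) if p == g) for g in range(4)]
correct = sum(s in gold_sets for s in pred_sets)
print("Correct groups:", correct, "| wall solved:", correct == 4)
```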