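This post introduces an interesting new feature: TryGetNonEnumeratedCount
In this episode
- Framework: .NET 6
- Feature: TryGetNonEnumeratedCount
Explanation
When I first saw this feature, before digging in, it struck me as odd. Doesn't the existing Enumerable.Count<TSource>(this IEnumerable<TSource> source) already optimize internally for ICollection<T> and ICollection (Note 1)? So what is the point of adding TryGetNonEnumeratedCount? After some poking around I reached a conclusion: some classes that implement neither ICollection<T> nor ICollection still carry something like a Count property or field, and those are the targets TryGetNonEnumeratedCount is meant to handle.
[Note 1] When an IEnumerable<T> is passed to Enumerable.Count, the method checks whether the instance implements ICollection<T> or ICollection. If it does, Count casts it to that interface and reads the Count property directly, without enumerating the whole sequence.
First, a word on the name. I jokingly call methods that start with Try "give-it-a-shot" methods: the return value is usually a bool that says whether the attempt succeeded, and the value we actually want comes back through an out parameter. NonEnumeratedCount, put plainly, means "the element count, as long as getting it requires no enumeration". Taken together, the method returns true only when the count can be obtained without enumerating.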
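As a minimal sketch of the Try pattern described above, the following compares a List<int> with a lazily generated sequence. The class name TryPatternDemo and the method Lazy are made up for illustration; only TryGetNonEnumeratedCount itself comes from the framework.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TryPatternDemo
{
    // Hypothetical lazy sequence: its length is unknown until it is enumerated.
    public static IEnumerable<int> Lazy()
    {
        yield return 1;
        yield return 2;
        yield return 3;
    }

    public static void Main()
    {
        IEnumerable<int> list = new List<int> { 1, 2, 3 };

        // List<T> implements ICollection<T>, so the count is available without enumeration.
        bool listOk = list.TryGetNonEnumeratedCount(out int listCount);
        Console.WriteLine($"{listOk}, {listCount}");   // True, 3

        // The compiler-generated iterator has no cheap count, so the attempt fails
        // and the out parameter is set to 0.
        bool lazyOk = Lazy().TryGetNonEnumeratedCount(out int lazyCount);
        Console.WriteLine($"{lazyOk}, {lazyCount}");   // False, 0
    }
}
```

Note that a failed attempt leaves count at 0, so the bool result is the only reliable signal; a count of 0 on its own could also mean an empty collection.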
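How Count works
1. If source can be cast to ICollection<T>, cast it and return ICollection<T>.Count.
2. If step 1 does not apply and source can be cast to Iterator<TSource> (Note 2), cast it and call Iterator<TSource>.GetCount(false) (Note 3).
3. If step 2 does not apply either and source can be cast to ICollection, cast it and return ICollection.Count.
4. If step 3 does not apply either, count on its fingers: loop over every element.
[Note 2] This is a private abstract class. Many Enumerable methods return instances of its subclasses; for example, Where may return ArrayWhereIterator<TSource>, IEnumerableWhereIterator<TSource>, and so on.
[Note 3] The method is declared as public abstract int GetCount(bool onlyIfCheap);. From the parameter name onlyIfCheap you can guess that passing true allows only the cheap route (reading the value directly, with nothing that might trigger enumeration). Passing false here means "compute it no matter the cost"; the expensive route usually walks the whole sequence and is therefore slower. When true is passed and no cheap route exists, the method usually returns a value less than zero.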
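To make the last step of the Count flow concrete, here is a small sketch with a sequence that records how many items it has produced. CountingSequence and ItemsProduced are hypothetical names introduced only for this illustration. Because the class implements nothing but IEnumerable<int>, Count() has no shortcut and must walk every element.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper type: implements only IEnumerable<int>,
// so none of the shortcut checks in Count() can apply to it.
public sealed class CountingSequence : IEnumerable<int>
{
    private readonly int _length;

    public CountingSequence(int length) => _length = length;

    // Number of items handed out so far, across all enumerations.
    public int ItemsProduced { get; private set; }

    public IEnumerator<int> GetEnumerator()
    {
        for (int i = 0; i < _length; i++)
        {
            ItemsProduced++;
            yield return i;
        }
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class CountFallbackDemo
{
    public static void Main()
    {
        var sequence = new CountingSequence(5);

        int count = sequence.Count();
        Console.WriteLine($"{count}, produced: {sequence.ItemsProduced}");   // 5, produced: 5

        // The Try variant refuses to enumerate, so the counter does not move.
        bool ok = sequence.TryGetNonEnumeratedCount(out _);
        Console.WriteLine($"{ok}, produced: {sequence.ItemsProduced}");      // False, produced: 5
    }
}
```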
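How TryGetNonEnumeratedCount works
1. If source can be cast to ICollection<T>, cast it, assign ICollection<T>.Count to the count parameter, and return true.
2. If step 1 does not apply and source can be cast to Iterator<TSource>, cast it and call Iterator<TSource>.GetCount(true). If the result is greater than or equal to zero, assign it to the count parameter and return true.
3. If step 2 does not apply, which can mean either (1) source cannot be cast to Iterator<TSource> or (2) GetCount returned a value less than 0, and source can be cast to ICollection, cast it, assign ICollection.Count to the count parameter, and return true.
4. If none of the above applies, set count = 0 and return false.
Comparing the two flows, the conclusion is that TryGetNonEnumeratedCount lives up to its name: it never walks the sequence.
So when performance matters, I suggest writing it like this, which covers every case and is fast whenever it can be: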
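The flow above can be approximated with public types only. The sketch below is not the framework implementation: TryGetCheapCount is a hypothetical helper that performs the ICollection<T> check, the ICollection check, and the fallback, and it deliberately leaves out the Iterator<TSource> check because that class is not public. Queue<T> makes a handy test case, since it implements the non-generic ICollection but not ICollection<T>.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class CheapCount
{
    // Hypothetical, simplified approximation of the flow described above.
    // The Iterator<TSource> check is omitted because that type is internal to LINQ.
    public static bool TryGetCheapCount<T>(IEnumerable<T> source, out int count)
    {
        if (source is null)
        {
            throw new ArgumentNullException(nameof(source));
        }

        // Generic collection: read Count directly.
        if (source is ICollection<T> genericCollection)
        {
            count = genericCollection.Count;
            return true;
        }

        // Non-generic collection: read Count directly.
        if (source is ICollection collection)
        {
            count = collection.Count;
            return true;
        }

        // No cheap route: report failure instead of enumerating.
        count = 0;
        return false;
    }

    public static void Main()
    {
        var queue = new Queue<int>();
        queue.Enqueue(10);
        queue.Enqueue(20);

        // Queue<T> is not an ICollection<T>, so this succeeds through the ICollection check.
        Console.WriteLine(TryGetCheapCount(queue, out int queueCount) + ", " + queueCount);   // True, 2
    }
}
```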
static int GetCount<T>(IEnumerable<T> source)
{
    if (source.TryGetNonEnumeratedCount(out int count))
    {
        return count;
    }
    else
    {
        return source.Count();
    }
}
Let's see who succeeds
I wrote a sample, NoneEnumeratedCountSample, to see at a glance which sources return true. It tests roughly the following:
using System.Runtime.CompilerServices; // needed for CallerArgumentExpression

static void Main(string[] args)
{
    var random = new Random();
    IEnumerable<int> range = Enumerable.Range(1, 1000);
    IEnumerable<Person> people = range.Select(i => new Person
    {
        Name = $"Name_{i}",
        Age = random.Next(10, 81)
    });

    Display(range);
    Display(people);
    Display(people.ToList());
    Display(people.ToArray());
    Display(people.Where(p => p.Age > 20));
    Display(people.Select(p => p.Age));
    Display(people.Where(p => p.Age > 20).Select(p => p.Age));
    Display(people.GroupBy(x => x.Age / 10));
    Display(people.DistinctBy(x => x.Age));
    Display(people.Distinct(new PersonAgeComparer()));
    Display(people.OrderBy(x => x.Age));
    Display(people.Where(x => x.Age > 20).OrderBy(x => x.Age));
    Display(GetStrings());
    Display(GetStrings().Select(x => x));

    // Local iterator function: a compiler-generated lazy sequence.
    static IEnumerable<string> GetStrings()
    {
        yield return "One";
        yield return "Two";
        yield return "Three";
    }

    var repeat = Enumerable.Repeat(new Person { Name = "Name", Age = 30 }, 10);
    Display(repeat);

    int[] array1 = { 1, 2, 3, 4, 5 };
    int[] array2 = { 1, 3, 4, 7, 9 };
    Display(array1.Intersect(array2));
    Display(array1.Union(array2));
    Display(array1.Except(array2));

    // teachers and students are static members of Program (not shown here).
    var teachers = Program.teachers;
    var students = Program.students;
    Display(teachers.Join(students, t => t.ClassName, s => s.ClassName, (t, s) => new { t.TeacherName, s.StudentName }));
}

static void Display<T>(IEnumerable<T> source, [CallerArgumentExpression(nameof(source))] string expression = null)
{
    bool success = source.TryGetNonEnumeratedCount(out int count);
    // Only successful attempts are printed, so every line of output reads "True".
    if (success)
    {
        Console.WriteLine($"Source: {expression}, Try Result: {success}, Count: {count}");
    }
}
Output:
Source: range, Try Result: True, Count: 1000
Source: people, Try Result: True, Count: 1000
Source: people.ToList(), Try Result: True, Count: 1000
Source: people.ToArray(), Try Result: True, Count: 1000
Source: people.Select(p => p.Age), Try Result: True, Count: 1000
Source: people.OrderBy(x => x.Age), Try Result: True, Count: 1000
Source: repeat, Try Result: True, Count: 10
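Most people probably already know that the List<T> and T[] produced by ToList and ToArray implement ICollection<T>. The remaining cases are more interesting:
- Enumerable.Range and Enumerable.Repeat can get the count cheaply because their iterators also implement ICollection<T> (at least in recent runtimes; older ones such as .NET 6 reach the same result through the internal cheap-count path instead).
- Select and OrderBy depend on whether their source IEnumerable<T> can provide a count cheaply. That is why people.Select(p => p.Age) returns true while people.Where(p => p.Age > 20).Select(p => p.Age) returns false. This is also where it gets interesting: the results of Select and OrderBy do not implement ICollection<T>, and cases like these are exactly where the performance difference comes from.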
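The Select versus Where().Select() observation can be reproduced in isolation. This is a small sketch assuming a standard desktop or server .NET 6+ runtime; the class name SelectVsWhereDemo and the sample data are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SelectVsWhereDemo
{
    public static void Main()
    {
        var numbers = new List<int> { 1, 2, 3, 4, 5 };

        // A projection keeps a one-to-one mapping, so the source's count is still valid.
        IEnumerable<int> projected = numbers.Select(n => n * 2);

        // A filter may drop elements, so the count is unknown until the predicate has run.
        IEnumerable<int> filtered = numbers.Where(n => n > 2).Select(n => n * 2);

        bool projectedOk = projected.TryGetNonEnumeratedCount(out int projectedCount);
        bool filteredOk = filtered.TryGetNonEnumeratedCount(out int filteredCount);

        Console.WriteLine($"Select:       {projectedOk}, {projectedCount}");
        Console.WriteLine($"Where.Select: {filteredOk}, {filteredCount}");
    }
}
```

On such a runtime the first line should report True with a count of 5 and the second False with a count of 0, which mirrors the sample output above.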
Benchmark
As is customary, let's run a performance test.
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

internal class Program
{
    static void Main(string[] args)
    {
        var summary = BenchmarkRunner.Run<NonEnumeratedCountBenchmark>();
    }
}

public class NonEnumeratedCountBenchmark
{
    private IEnumerable<Person> _people;
    private IEnumerable<int> _people_selectAge;
    private IEnumerable<Person> _people_orderby;
    private IEnumerable<Person> _people_where_orderby;
    private IEnumerable<int> _people_selectAge_orderby;

    [GlobalSetup]
    public void Setup()
    {
        var random = new Random();
        List<Person> list = new List<Person>(1000);
        for (int i = 0; i < 1000; i++)
        {
            list.Add(new Person
            {
                Name = $"Name_{i}",
                Age = random.Next(10, 81)
            });
        }

        _people = list;
        _people_selectAge = _people.Select(p => p.Age);
        _people_orderby = _people.OrderBy(p => p.Age);
        // Every Age is at least 10, so this filter keeps all 1000 elements.
        _people_where_orderby = _people.Where(p => p.Age > 9).OrderBy(p => p.Age);
        _people_selectAge_orderby = _people_selectAge.OrderBy(age => age);
    }

    [Benchmark]
    public void CallCustomCount_People()
    {
        var count = CustomCount(_people);
    }

    [Benchmark]
    public void CallCount_People()
    {
        var count = _people.Count();
    }

    [Benchmark]
    public void CallCustomCount_SelectAge()
    {
        var count = CustomCount(_people_selectAge);
    }

    [Benchmark]
    public void CallCount_SelectAge()
    {
        var count = _people_selectAge.Count();
    }

    [Benchmark]
    public void CallCustomCount_OrderBy()
    {
        var count = CustomCount(_people_orderby);
    }

    [Benchmark]
    public void CallCount_OrderBy()
    {
        var count = _people_orderby.Count();
    }

    [Benchmark]
    public void CallCustomCount_WhereOrderBy()
    {
        var count = CustomCount(_people_where_orderby);
    }

    [Benchmark]
    public void CallCount_WhereOrderBy()
    {
        var count = _people_where_orderby.Count();
    }

    [Benchmark]
    public void CallCustomCount_SelectAgeOrderBy()
    {
        var count = CustomCount(_people_selectAge_orderby);
    }

    [Benchmark]
    public void CallCount_SelectAgeOrderBy()
    {
        var count = _people_selectAge_orderby.Count();
    }

    // Try the cheap route first; fall back to Count() only when it fails.
    static int CustomCount<T>(IEnumerable<T> source)
    {
        if (source.TryGetNonEnumeratedCount(out int count))
        {
            return count;
        }
        else
        {
            return source.Count();
        }
    }
}

public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}
Results:
// * Summary *
BenchmarkDotNet v0.14.0, Windows 11 (10.0.22631.4751/23H2/2023Update/SunValley3)
12th Gen Intel Core i7-1265U, 1 CPU, 12 logical and 10 physical cores
.NET SDK 9.0.200-preview.0.25057.12
[Host] : .NET 9.0.1 (9.0.124.61010), X64 RyuJIT AVX2
DefaultJob : .NET 9.0.1 (9.0.124.61010), X64 RyuJIT AVX2
| Method | Mean | Error | StdDev | Median |
|--------------------------------- |-------------:|-----------:|-----------:|-------------:|
| CallCustomCount_People | 3.442 ns | 0.1031 ns | 0.1058 ns | 3.409 ns |
| CallCount_People | 4.360 ns | 0.1211 ns | 0.2056 ns | 4.330 ns |
| CallCustomCount_SelectAge | 1.655 ns | 0.0489 ns | 0.0382 ns | 1.653 ns |
| CallCount_SelectAge | 1,269.816 ns | 29.5624 ns | 86.7013 ns | 1,253.890 ns |
| CallCustomCount_OrderBy | 19.590 ns | 1.2523 ns | 3.6924 ns | 17.444 ns |
| CallCount_OrderBy | 13.127 ns | 0.3050 ns | 0.5500 ns | 13.198 ns |
| CallCustomCount_WhereOrderBy | 636.515 ns | 12.7756 ns | 19.8901 ns | 639.559 ns |
| CallCount_WhereOrderBy | 628.010 ns | 11.7456 ns | 11.5357 ns | 628.727 ns |
| CallCustomCount_SelectAgeOrderBy | 3.471 ns | 0.0999 ns | 0.0885 ns | 3.498 ns |
| CallCount_SelectAgeOrderBy | 1,215.613 ns | 24.1754 ns | 45.4072 ns | 1,217.556 ns |
// * Summary *
BenchmarkDotNet v0.14.0, Windows 11 (10.0.22631.4751/23H2/2023Update/SunValley3)
12th Gen Intel Core i7-1265U, 1 CPU, 12 logical and 10 physical cores
.NET SDK 9.0.200-preview.0.25057.12
[Host] : .NET 9.0.1 (9.0.124.61010), X64 RyuJIT AVX2
DefaultJob : .NET 9.0.1 (9.0.124.61010), X64 RyuJIT AVX2
| Method | Mean | Error | StdDev |
|--------------------------------- |-------------:|-----------:|-----------:|
| CallCustomCount_People | 4.015 ns | 0.1194 ns | 0.2059 ns |
| CallCount_People | 3.537 ns | 0.1126 ns | 0.2030 ns |
| CallCustomCount_SelectAge | 1.507 ns | 0.0705 ns | 0.0754 ns |
| CallCount_SelectAge | 1,024.929 ns | 20.5122 ns | 28.7552 ns |
| CallCustomCount_OrderBy | 14.764 ns | 0.3236 ns | 0.4429 ns |
| CallCount_OrderBy | 11.504 ns | 0.2714 ns | 0.3432 ns |
| CallCustomCount_WhereOrderBy | 535.035 ns | 10.5585 ns | 12.5692 ns |
| CallCount_WhereOrderBy | 539.504 ns | 10.2631 ns | 25.7480 ns |
| CallCustomCount_SelectAgeOrderBy | 2.985 ns | 0.0838 ns | 0.0743 ns |
| CallCount_SelectAgeOrderBy | 1,013.673 ns | 20.0785 ns | 28.1473 ns |
The pairs with a clear difference are:
- CallCustomCount_SelectAge vs CallCount_SelectAge
- CallCustomCount_SelectAgeOrderBy vs CallCount_SelectAgeOrderBy
The benchmark sample is here.
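One plausible contributor to the gap in these two pairs, offered as my own reading rather than something measured above, is that Count() on a Select projection still invokes the selector once per element, while TryGetNonEnumeratedCount never does. The sketch below counts selector invocations; SelectorCallsDemo and CountSelectorCalls are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SelectorCallsDemo
{
    // Returns how many times the selector ran while obtaining the count.
    public static int CountSelectorCalls(bool useTryGet)
    {
        int calls = 0;
        var source = new List<int> { 1, 2, 3, 4, 5 };
        IEnumerable<int> projected = source.Select(n =>
        {
            calls++;   // side effect used to observe selector invocations
            return n;
        });

        if (useTryGet)
        {
            projected.TryGetNonEnumeratedCount(out _);
        }
        else
        {
            projected.Count();
        }

        return calls;
    }

    public static void Main()
    {
        Console.WriteLine(CountSelectorCalls(useTryGet: true));    // 0
        Console.WriteLine(CountSelectorCalls(useTryGet: false));   // 5
    }
}
```

With 1000 elements, as in the benchmark, that would be 1000 selector calls per Count() against none for the Try variant, which is consistent with the direction of the measured difference.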